Extended technology mapping

Header: mockturtle/algorithms/emap.hpp

The command emap stands for extended mapper. It supports large library cells, of more than 6 inputs, and can perform matching using 3 different methods: Boolean, pattern, or hybrid. The current version can map to 2-output gates, such as full adders and half adders, and provides a 2x speedup in mapping time compared to command map for similar or better quality. Similarly, to map, the implementation is independent of the underlying graph representation. Additionally, emap supports “don’t touch” white boxes (gates).

Command emap can return the mapped network in two formats. Command emap returns a cell_view<block_network> that supports multi-output cells. Command emap_klut returns a binding_view<klut_network> similarly as command map.

The following example shows how to perform delay-oriented technology mapping from an and-inverter graph using large cells up to 9 inputs:

aig_network aig = ...;

/* read cell library in genlib format */
std::vector<gate> gates;
std::ifstream in( ... );
lorina::read_genlib( in, genlib_reader( gates ) )
tech_library<9> tech_lib( gates );

/* perform technology mapping */
cell_view<block_network> res = emap<9>( aig, tech_lib );

The next example performs area-oriented graph mapping using multi-output cells:

aig_network aig = ...;

/* read cell library in genlib format */
std::vector<gate> gates;
std::ifstream in( ... );
lorina::read_genlib( in, genlib_reader( gates ) )
tech_library tech_lib( gates );

/* perform technology mapping */
emap_params ps;
ps.area_oriented_mapping = true;
ps.map_multioutput = true;
cell_view<block_network> res = emap( aig, tech_lib, ps );

In this case, emap is used to return a block_network, which can respresent multi-output cells as single nodes. Alternatively, also emap_klut can be used but multi-output cells would be reporesented by single-output nodes.

The maximum number of cuts stored for each node is limited to 20. To increase this limit, change max_cut_num in emap.

You can set the inputs arrival time and output required times using the parameters arrival_times and required times. Moreover, it is possible to ask for a required time relaxation. For instance, if we want to map a network with an increase of 10% over its minimal delay, we can set relax_required to 10.

For further details and usage scenarios of emap, such as white boxes, please check the related tests.

Parameters and statistics

struct emap_params

Parameters for emap.

The data structure emap_params holds configurable parameters with default arguments for emap.

Public Types

enum matching_mode_t

Matching mode.

Boolean uses Boolean matching (up to 6-input cells), Structural uses pattern matching for fully-DSD cells, Hybrid combines the two.

Values:

enumerator boolean

enumerator structural

enumerator hybrid

Public Members

cut_enumeration_params cut_enumeration_ps = {}

Parameters for cut enumeration.

The default cut limit is 16. The maximum cut limit is 19. By default, truth table minimization is performed.

bool area_oriented_mapping = {false}: Do area-oriented mapping.

bool map_multioutput = {false}: Maps using multi-output gates.

double required_time = {0.0f}: Target required time (for each PO).

double relax_required = {0.0f}: Required time relaxation in percentage (10 = 10%).

std::vector<double> arrival_times = {}: Custom input arrival times.

std::vector<double> required_times = {}: Custom output required times.

uint32_t area_flow_rounds = {3u}: Number of rounds for area flow optimization.

uint32_t ela_rounds = {2u}: Number of rounds for exact area optimization.

uint32_t eswp_rounds = {0u}: Number of rounds for exact switching power optimization.

uint32_t switching_activity_patterns = {2048u}: Number of patterns for switching activity computation.

bool use_match_alternatives = {true}: Compute area-oriented alternative matches.

bool remove_dominated_cuts = {false}: Remove the cuts that are contained in others.

bool remove_overlapping_multicuts = {false}: Remove overlapping multi-output cuts.

bool verbose = {false}: Be verbose.

struct emap_stats

Statistics for emap.

The data structure emap_stats provides data collected by running emap.

Public Members

double area = {0}: Area result.

double delay = {0}: Worst delay result.

double power = {0}: Power result.

uint32_t inverters = {0}: Power result.

uint32_t multioutput_gates = {0}: Mapped multi-output gates.

stopwatch::duration time_multioutput = {0}: Runtime for multi-output matching.

stopwatch::duration time_total = {0}: Total runtime.

cut_enumeration_stats cut_enumeration_st = {}: Cut enumeration stats.

std::vector<std::string> round_stats = {}: Delay and area stats for each round.

bool mapping_error = {false}: Mapping error.

Algorithm

template<unsigned CutSize = 6u, class Ntk, unsigned NInputs, classification_type Configuration> cell_view<block_network> mockturtle::emap(Ntk const &ntk, tech_library<NInputs, Configuration> const &library, emap_params const &ps = {}, emap_stats *pst = nullptr)

Technology mapping.

This function implements a technology mapping algorithm.

The function takes the size of the cuts in the template parameter CutSize.

The function returns a block network that supports multi-output cells.

The novelties of this mapper are contained in 2 publications:

A. Tempia Calvino and G. De Micheli, “Technology Mapping Using Multi-Output Library Cells,” ICCAD, 2023.
G. Radi, A. Tempia Calvino, and G. De Micheli, “In Medio Stat Virtus: Combining Boolean and Pattern Matching,” ASP-DAC, 2024.

Required network functions:

size
is_pi
is_constant
node_to_index
index_to_node
get_node
foreach_po
foreach_node
fanout_size

Parameters:

ntk – Network
library – Technology library
ps – Mapping params
pst – Mapping statistics

template<unsigned CutSize = 6u, class Ntk, unsigned NInputs, classification_type Configuration> binding_view<klut_network> mockturtle::emap_klut(Ntk const &ntk, tech_library<NInputs, Configuration> const &library, emap_params const &ps = {}, emap_stats *pst = nullptr)

Technology mapping.

This function implements a technology mapping algorithm.

The function takes the size of the cuts in the template parameter CutSize.

The function returns a k-LUT network. Each LUT abstacts a gate of the technology library.

The novelties of this mapper are contained in 2 publications:

A. Tempia Calvino and G. De Micheli, “Technology Mapping Using Multi-Output Library Cells,” ICCAD, 2023.
G. Radi, A. Tempia Calvino, and G. De Micheli, “In Medio Stat Virtus: Combining Boolean and Pattern Matching,” ASP-DAC, 2024.

Required network functions:

size
is_pi
is_constant
node_to_index
index_to_node
get_node
foreach_po
foreach_node
fanout_size

Parameters:

ntk – Network
library – Technology library
ps – Mapping params
pst – Mapping statistics

template<unsigned CutSize = 6u, class Ntk, unsigned NInputs, classification_type Configuration> binding_view<klut_network> mockturtle::emap_node_map(Ntk const &ntk, tech_library<NInputs, Configuration> const &library, emap_params const &ps = {}, emap_stats *pst = nullptr)

Technology node mapping.

This function implements a simple technology mapping algorithm. The algorithm maps each node to the best implementation in the technology library.

Required network functions:

size
is_pi
is_constant
node_to_index
index_to_node
get_node
foreach_po
foreach_node
fanout_size
has_binding

Parameters:

ntk – Network
library – Technology library
ps – Mapping params
pst – Mapping statistics

template<class Ntk> void mockturtle::emap_load_mapping(Ntk &ntk)

Technology node mapping.

This function implements a simple technology mapping algorithm. The algorithm maps each node to the first implementation in the technology library.

The input must be a binding_view with the gates correctly loaded.

Required network functions:

size
is_pi
is_constant
node_to_index
index_to_node
get_node
foreach_po
foreach_node
fanout_size
has_binding

Parameters:: ntk – Network

Technology mapping and network conversion

Header: mockturtle/algorithms/mapper.hpp

A versatile mapper that supports technology mapping and graph mapping (optimized network conversion). The mapper is independent of the underlying graph representation. Hence, it supports generic subject graph representations (e.g., AIG, and MIG) and a generic target representation (e.g. cell library, XAG, XMG). The mapper aims at finding a good mapping with respect to delay, area, and switching power.

The mapper uses a library (hash table) to facilitate Boolean matching. For technology mapping, it needs tech_library while for graph mapping it needs exact_library. For technology mapping, the generation of both NP- and P-configurations of gates are supported. Generally, it is convenient to use NP-configurations for small or medium size cell libraries. For bigger libraries, P-configurations should perform better. You can test both the configurations to see which one has the best run time. For graph mapping, NPN classification is used instead.

The following example shows how to perform delay-oriented technology mapping from an and-inverter graph using the default settings:

aig_network aig = ...;

/* read cell library in genlib format */
std::vector<gate> gates;
std::ifstream in( ... );
lorina::read_genlib( in, genlib_reader( gates ) )
tech_library tech_lib( gates );

/* perform technology mapping */
binding_view<klut_network> res = map( aig, tech_lib );

The mapped network is returned as a binding_view that extends a k-LUT network. Each k-LUT abstracts a cell and the view contains the binding information.

The next example performs area-oriented graph mapping from AIG to MIG using a NPN resynthesis database of structures:

aig_network aig = ...;

/* load the npn database in the library */
mig_npn_resynthesis resyn{ true };
exact_library<mig_network> exact_lib( resyn );

/* perform graph mapping */
map_params ps;
ps.skip_delay_round = true;
ps.required_time = std::numeric_limits<double>::max();
mig_network res = map( aig, exact_lib, ps );

For graph mapping, we suggest reading the network directly in the target graph representation if possible (e.g. read an AIG as a MIG) since the mapping often leads to better results in this setting.

For technology mapping of sequential networks, a dedicated command seq_map should be called. Only in case of graph mapping, the command map can be used on sequential networks.

The following example shows how to perform delay-oriented technology mapping from a sequential and-inverter graph:

sequential<aig_network> aig = ...;

/* read cell library in genlib format */
std::vector<gate> gates;
std::ifstream in( ... );
lorina::read_genlib( in, genlib_reader( gates ) )
tech_library tech_lib( gates );

/* perform technology mapping */
using res_t = binding_view<sequential<klut_network>>;
res_t res = seq_map( aig, tech_lib );

The next example performs area-oriented graph mapping from a sequential AIG to a sequential MIG using a NPN resynthesis database of structures:

sequential<aig_network> aig = ...;

/* load the npn database in the library */
mig_npn_resynthesis resyn{ true };
exact_library<sequential<mig_network>> exact_lib( resyn );

/* perform graph mapping */
map_params ps;
ps.skip_delay_round = true;
ps.required_time = std::numeric_limits<double>::max();
sequential<mig_network> res = map( aig, exact_lib, ps );

The newest version of map for graph mapping or rewriting can leverage satisfiability don’t cares:

aig_network aig = ...;

/* load the npn database in the library and compute don't care classes */
mig_npn_resynthesis resyn{ true };
exact_library_params lps;
lps.compute_dc_classes = true;
exact_library<mig_network> exact_lib( resyn, lps );

/* perform area-oriented rewriting */
map_params ps;
ps.skip_delay_round = true;
ps.required_time = std::numeric_limits<double>::max();
ps.use_dont_cares = true;
mig_network res = map( aig, exact_lib, ps );

As a default setting, cut enumeration minimizes the truth tables. This helps improving the results but slows down the computation. We suggest to keep it always true. Anyhow, for a faster mapping, set the truth table minimization parameter to false. The maximum number of cuts stored for each node is limited to 49. To increase this limit, change max_cut_num in fast_network_cuts.

Parameters and statistics

struct map_params

Parameters for map.

The data structure map_params holds configurable parameters with default arguments for map.

Public Members

cut_enumeration_params cut_enumeration_ps = {}

Parameters for cut enumeration.

The default cut limit is 49. By default, truth table minimization is performed.

double required_time = {0.0f}: Required time for delay optimization.

bool skip_delay_round = {false}: Skip delay round for area optimization.

uint32_t area_flow_rounds = {1u}: Number of rounds for area flow optimization.

uint32_t ela_rounds = {2u}: Number of rounds for exact area optimization.

uint32_t eswp_rounds = {0u}: Number of rounds for exact switching power optimization.

uint32_t switching_activity_patterns = {2048u}: Number of patterns for switching activity computation.

bool enable_logic_sharing = {false}: Exploit logic sharing in exact area optimization of graph mapping.

uint32_t logic_sharing_cut_limit = {8u}: Maximum number of cuts evaluated for logic sharing.

bool use_dont_cares = {false}: Use satisfiability don’t cares for optimization.

uint32_t window_size = {12u}: Window size for don’t cares calculation.

bool verbose = {false}: Be verbose.

struct map_stats

Statistics for mapper.

The data structure map_stats provides data collected by running map.

Public Members

double area = {0}: Area result.

double delay = {0}: Worst delay result.

double power = {0}: Power result.

stopwatch::duration time_mapping = {0}: Runtime for covering.

stopwatch::duration time_total = {0}: Total runtime.

cut_enumeration_stats cut_enumeration_st = {}: Cut enumeration stats.

std::vector<std::string> round_stats = {}: Delay and area stats for each round.

bool mapping_error = {false}: Mapping error.

Algorithm

template<class Ntk, unsigned CutSize = 5u, typename CutData = cut_enumeration_tech_map_cut, unsigned NInputs, classification_type Configuration> binding_view<klut_network> mockturtle::map(Ntk const &ntk, tech_library<NInputs, Configuration> const &library, map_params const &ps = {}, map_stats *pst = nullptr)

Technology mapping.

This function implements a technology mapping algorithm. It is controlled by a template argument CutData (defaulted to cut_enumeration_tech_map_cut). The argument is similar to the CutData argument in cut_enumeration, which can specialize the cost function to select priority cuts and store additional data. The default argument gives priority firstly to the cut size, then delay, and lastly to area flow. Thus, it is more suited for delay-oriented mapping. The type passed as CutData must implement the following four fields:

uint32_t delay
float flow
uint8_t match_index
bool ignore

See include/mockturtle/algorithms/cut_enumeration/cut_enumeration_tech_map_cut.hpp for one example of a CutData type that implements the cost function that is used in the technology mapper.

The function takes the size of the cuts in the template parameter CutSize.

The function returns a k-LUT network. Each LUT abstracts a gate of the technology library.

Required network functions:

size
is_ci
is_constant
node_to_index
index_to_node
get_node
foreach_pi
foreach_po
foreach_co
foreach_node
fanout_size

The implementation of this algorithm was inspired by the mapping command map in ABC.

Parameters:

ntk – Network
library – Technology library
ps – Mapping params
pst – Mapping statistics

template<class Ntk, unsigned CutSize = 4u, typename CutData = cut_enumeration_exact_map_cut, class NtkDest, unsigned NInputs> NtkDest mockturtle::map(Ntk &ntk, exact_library<NtkDest, NInputs> const &library, map_params const &ps = {}, map_stats *pst = nullptr)

Exact mapping.

This function implements a mapping algorithm using a database of structures. It is controlled by a template argument CutData (defaulted to cut_enumeration_exact_map_cut). The argument is similar to the CutData argument in cut_enumeration, which can specialize the cost function to select priority cuts and store additional data. The default argument gives priority firstly to area flow, then delay, and lastly to the cut size. The type passed as CutData must implement the following four fields:

uint32_t delay
float flow
uint8_t match_index
bool ignore

See include/mockturtle/algorithms/cut_enumeration/cut_enumeration_exact_map_cut.hpp for one example of a CutData type that implements the cost function that is used in the technology mapper.

The function takes the size of the cuts in the template parameter CutSize.

The function returns a mapped network representation generated using the exact synthesis entries in the exact_library. This function supports also sequential networks.

Required network functions:

size
is_ci
is_constant
node_to_index
index_to_node
get_node
foreach_po
foreach_node
fanout_size

Parameters:

ntk – Network
library – Exact library
ps – Mapping params
pst – Mapping statistics