IO

Input/output helpers and format conversion utilities.

Core conversion

synkit.IO.chem_converter.detect_its_format(graph: Graph) → Literal['typesGH', 'tuple']

Detect the ITS storage representation used by a graph.

Legacy ITS graphs keep scalar node attributes and store side-specific values only in typesGH. Tuple ITS graphs store direct paired node and edge attributes such as element=("C", "C") or sigma_order=(1.0, 1.0).

Parameters:

graph (nx.Graph) – ITS-like graph to inspect.

Returns:

Detected ITS format.

Return type:

ITSFormat
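The difference between the two layouts can be sketched with plain dictionaries standing in for NetworkX node attributes. The helper name `guess_its_format`, the rule of checking only the `element` attribute, and the placeholder `typesGH` payload are assumptions made for illustration; this is not the library's implementation.

```python
from typing import Any, Dict, Literal

ITSFormat = Literal["typesGH", "tuple"]


def guess_its_format(nodes: Dict[int, Dict[str, Any]]) -> ITSFormat:
    """Hypothetical helper: classify node-attribute dicts by ITS layout."""
    for attrs in nodes.values():
        # Tuple ITS stores paired (reactant, product) values directly.
        if isinstance(attrs.get("element"), tuple):
            return "tuple"
    # Otherwise assume legacy ITS: scalar attributes plus a typesGH entry.
    return "typesGH"


# Placeholder typesGH payload; the real field layout is not shown here.
legacy_nodes = {1: {"element": "C", "typesGH": (("C", 0), ("C", 0))}}
tuple_nodes = {1: {"element": ("C", "C")}}

print(guess_its_format(legacy_nodes))  # typesGH
print(guess_its_format(tuple_nodes))   # tuple
```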

synkit.IO.chem_converter.dfs_to_smiles(dfs: str, keep_map: bool = True) → str

Convert DFS-style annotated SMILES to normal SMILES form.

Rules:

  • Replace [] with [*].

  • Convert bracketed tokens followed by digits, such as [H]12, into atom-mapped tokens [H:12] when keep_map=True.

  • If keep_map=False, remove trailing digits instead.

  • Tokens already containing : inside brackets are preserved.

Parameters:
  • dfs (str) – DFS-style SMILES or reaction SMILES.

  • keep_map (bool) – Whether to keep atom maps.

Returns:

Converted SMILES string.

Return type:

str
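The rules listed for dfs_to_smiles can be approximated with a single regular expression. The sketch below is a standalone re-implementation written only from those rules; the name `dfs_to_smiles_sketch` is hypothetical and the library's own handling of edge cases (for example ring-closure digits) may differ.

```python
import re

# A bracketed token, optionally followed by a run of digits.
_TOKEN = re.compile(r"\[([^\[\]]*)\](\d*)")


def dfs_to_smiles_sketch(dfs: str, keep_map: bool = True) -> str:
    """Hypothetical re-implementation of the documented conversion rules."""

    def repl(match):
        body, digits = match.group(1), match.group(2)
        if ":" in body:
            # Already atom-mapped inside the brackets: leave untouched.
            return match.group(0)
        body = body or "*"  # [] becomes [*]
        if digits and keep_map:
            return f"[{body}:{digits}]"
        # Either no digits, or keep_map=False: drop trailing digits.
        return f"[{body}]"

    return _TOKEN.sub(repl, dfs)


print(dfs_to_smiles_sketch("[H]12[C]3[]4"))
print(dfs_to_smiles_sketch("[H]12[C]3[]4", keep_map=False))
```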

synkit.IO.chem_converter.gml_to_its(gml: str) → Graph

Convert a GML reaction rule back into an ITS graph.

Parameters:

gml (str) – GML string.

Returns:

ITS graph.

Return type:

nx.Graph

synkit.IO.chem_converter.gml_to_smart(gml: str, sanitize: bool = True, explicit_hydrogen: bool = False, useSmiles: bool = True) → str

Convert a GML reaction rule back to reaction SMILES or SMARTS.

Parameters:
  • gml (str) – GML string.

  • sanitize (bool) – If True, sanitize during SMILES generation.

  • explicit_hydrogen (bool) – If True, keep explicit hydrogens.

  • useSmiles (bool) – If True, return reaction SMILES. Otherwise, return reaction SMARTS.

Returns:

Reaction SMILES or SMARTS string.

Return type:

str

Raises:

ValueError – If conversion fails.

synkit.IO.chem_converter.graph_to_rsmi(r: Graph, p: Graph, its: Graph | None = None, sanitize: bool = True, explicit_hydrogen: bool = False) → str | None

Convert reactant and product graphs into a reaction SMILES string.

Parameters:
  • r (networkx.Graph) – Graph representing the reactants.

  • p (networkx.Graph) – Graph representing the products.

  • its (networkx.Graph or None) – Imaginary transition state graph. If None, it will be constructed.

  • sanitize (bool) – Whether to sanitize molecules during conversion.

  • explicit_hydrogen (bool) – Whether to preserve explicit hydrogens in the SMILES.

Returns:

Reaction SMILES string in ‘reactants>>products’ format or None on failure.

Return type:

str or None

synkit.IO.chem_converter.graph_to_smi(graph: Graph, sanitize: bool = True, preserve_atom_maps: Sequence[int] | None = None) → str | None

Convert a molecular graph to a SMILES string.

Parameters:
  • graph (nx.Graph) – Molecular graph.

  • sanitize (bool) – If True, sanitize the generated molecule.

  • preserve_atom_maps (Optional[Sequence[int]]) – Atom-map numbers whose hydrogens should remain explicit.

Returns:

SMILES string or None on failure.

Return type:

Optional[str]

synkit.IO.chem_converter.its_to_gml(its: Graph, core: bool = True, rule_name: str = 'rule', reindex: bool = True, explicit_hydrogen: bool = False, format: Literal['typesGH', 'tuple'] = 'typesGH') → str

Convert an ITS graph to GML format.

Parameters:
  • its (nx.Graph) – ITS graph.

  • core (bool) – If True, export the reaction-center graph.

  • rule_name (str) – Rule name stored in the GML output.

  • reindex (bool) – If True, reindex graph nodes before export.

  • explicit_hydrogen (bool) – If True, include explicit hydrogens.

  • format (ITSFormat) – ITS format.

Returns:

GML representation of the reaction.

Return type:

str

synkit.IO.chem_converter.its_to_rsmi(its: Graph, sanitize: bool = True, explicit_hydrogen: bool = False, clean_wildcards: bool = False, format: Literal['typesGH', 'tuple'] = 'typesGH') → str

Convert an ITS graph into a reaction SMILES (rSMI) string.

This function decomposes or reverts the ITS graph into reactant and product graphs depending on the selected ITS format, then serializes them into a reaction SMILES string.

Parameters:
  • its (nx.Graph) – ITS graph to convert back into reaction SMILES.

  • sanitize (bool) – If True, sanitize graphs before SMILES generation.

  • explicit_hydrogen (bool) – If True, include explicit hydrogens in the generated SMILES.

  • clean_wildcards (bool) – If True, clean wildcard radicals in the generated reaction SMILES.

  • format (ITSFormat) – ITS format. Supported values are "typesGH" and "tuple".

Returns:

Reaction SMILES string.

Return type:

str

Raises:

ValueError – If the ITS format is unsupported.

synkit.IO.chem_converter.normalize_dfs_for_compare(dfs: str) → str

Normalize DFS-style strings for comparison.

Parameters:

dfs (str) – DFS-style string.

Returns:

Normalized comparison string.

Return type:

str

synkit.IO.chem_converter.rsmarts_to_rsmi(rsmarts: str) → str

Convert reaction SMARTS to reaction SMILES.

Parameters:

rsmarts (str) – Reaction SMARTS input.

Returns:

Reaction SMILES string.

Return type:

str

Raises:

ValueError – If conversion fails.

synkit.IO.chem_converter.rsmi_to_graph(rsmi: str, drop_non_aam: bool = True, sanitize: bool = True, use_index_as_atom_map: bool = True, node_attrs: Sequence[str] | None = None, edge_attrs: Sequence[str] | None = None) → tuple[Graph | None, Graph | None]

Convert a reaction SMILES into reactant and product graphs.

Parameters:
  • rsmi (str) – Reaction SMILES string in reactants>>products format.

  • drop_non_aam (bool) – If True, drop atoms lacking atom maps.

  • sanitize (bool) – If True, sanitize molecules during conversion.

  • use_index_as_atom_map (bool) – If True, overwrite atom-map labels using atom indices.

  • node_attrs (Optional[Sequence[str]]) – Node attributes to export into the graphs.

  • edge_attrs (Optional[Sequence[str]]) – Edge attributes to export into the graphs.

Returns:

Tuple of reactant and product graphs.

Return type:

tuple[Optional[nx.Graph], Optional[nx.Graph]]

synkit.IO.chem_converter.rsmi_to_its(rsmi: str, drop_non_aam: bool = True, sanitize: bool = True, use_index_as_atom_map: bool = True, core: bool = False, node_attrs: Sequence[str] | None = None, edge_attrs: Sequence[str] | None = None, explicit_hydrogen: bool = False, format: Literal['typesGH', 'tuple'] = 'typesGH') → Graph

Convert a reaction SMILES into an ITS graph.

Supported formats:

  • "typesGH": legacy ITS representation

  • "tuple": paired-attribute ITS representation

Parameters:
  • rsmi (str) – Reaction SMILES string.

  • drop_non_aam (bool) – If True, discard fragments lacking atom maps.

  • sanitize (bool) – If True, sanitize molecules during conversion.

  • use_index_as_atom_map (bool) – If True, overwrite atom maps using atom indices.

  • core (bool) – If True, return only the reaction-center graph.

  • node_attrs (Optional[Sequence[str]]) – Node attributes to include in graph construction.

  • edge_attrs (Optional[Sequence[str]]) – Edge attributes to include in graph construction.

  • explicit_hydrogen (bool) – If True, convert implicit hydrogens to explicit nodes for the selected ITS format.

  • format (ITSFormat) – ITS format.

Returns:

ITS graph or RC graph.

Return type:

nx.Graph

Raises:

ValueError – If graph construction fails.

synkit.IO.chem_converter.rsmi_to_rsmarts(rsmi: str) → str

Convert mapped reaction SMILES to reaction SMARTS.

Parameters:

rsmi (str) – Reaction SMILES input.

Returns:

Reaction SMARTS string.

Return type:

str

Raises:

ValueError – If conversion fails.

synkit.IO.chem_converter.smart_to_gml(smart: str, core: bool = True, sanitize: bool = True, rule_name: str = 'rule', reindex: bool = False, explicit_hydrogen: bool = False, useSmiles: bool = True) → str

Convert a reaction SMARTS or SMILES string into GML.

This function uses the legacy ITS/GML pipeline.

Parameters:
  • smart (str) – Reaction SMARTS or SMILES string.

  • core (bool) – If True, export only the reaction core.

  • sanitize (bool) – If True, sanitize molecules during conversion.

  • rule_name (str) – Rule name stored in the GML output.

  • reindex (bool) – If True, reindex graph nodes before export.

  • explicit_hydrogen (bool) – If True, include explicit hydrogens.

  • useSmiles (bool) – If True, treat input as reaction SMILES. Otherwise, treat it as reaction SMARTS.

Returns:

GML representation of the reaction rule.

Return type:

str

Raises:

ValueError – If graph construction fails.

synkit.IO.chem_converter.smiles_to_dfs(smiles: str) → str

Convert SMILES with atom maps into DFS-style notation.

Rules:

  • [X:123] becomes [X]123.

  • [*:3] becomes []3.

  • Unmapped tokens remain unchanged.

  • Any remaining [*] is normalized back to [].

Parameters:

smiles (str) – SMILES or reaction SMILES.

Returns:

DFS-style string.

Return type:

str
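The rules for smiles_to_dfs can likewise be sketched with a regular expression. The function name `smiles_to_dfs_sketch` is hypothetical, and the code is written only from the rules stated for this function rather than from the library source.

```python
import re

# A bracketed atom carrying an atom map, e.g. [CH3:1] or [*:3].
_MAPPED = re.compile(r"\[([^\[\]:]+):(\d+)\]")


def smiles_to_dfs_sketch(smiles: str) -> str:
    """Hypothetical re-implementation of the documented conversion rules."""

    def repl(match):
        body, atom_map = match.group(1), match.group(2)
        if body == "*":
            body = ""  # [*:3] becomes []3
        return f"[{body}]{atom_map}"

    converted = _MAPPED.sub(repl, smiles)
    # Unmapped wildcards are normalized to the empty-bracket form.
    return converted.replace("[*]", "[]")


print(smiles_to_dfs_sketch("[CH3:1][OH:2]>>[CH2:1]=[O:2]"))
print(smiles_to_dfs_sketch("[*:3]C[*]"))
```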

synkit.IO.chem_converter.smiles_to_graph(smiles: str, drop_non_aam: bool = False, sanitize: bool = True, use_index_as_atom_map: bool = False, node_attrs: Sequence[str] | None = None, edge_attrs: Sequence[str] | None = None) → Graph | None

Convert a SMILES string to a molecular graph.

Parameters:
  • smiles (str) – SMILES representation of the molecule.

  • drop_non_aam (bool) – If True, drop atoms without atom-map labels.

  • sanitize (bool) – If True, sanitize the RDKit molecule.

  • use_index_as_atom_map (bool) – If True, overwrite atom-map labels using atom indices.

  • node_attrs (Optional[Sequence[str]]) – Node attributes to export into the graph.

  • edge_attrs (Optional[Sequence[str]]) – Edge attributes to export into the graph.

Returns:

Molecular graph or None on failure.

Return type:

Optional[nx.Graph]

class synkit.IO.mol_to_graph.MolToGraph(node_attrs: List[str] | None = None, edge_attrs: List[str] | None = None, *, attr_profile: str = 'minimal', with_topology: bool = False)

Bases: object

Convert an RDKit molecule into a NetworkX molecular graph.

The converter preserves the public API while adding corrected lone-pair bookkeeping for aromatic heteroatoms, especially pyrrolic / [nH]-like aromatic nitrogen. RDKit aromatic bonds have order 1.5; for aromatic lone-pair donor heteroatoms, this class counts aromatic bonds as sigma bonds during lone-pair estimation.

Important node fields are estimated_lone_pairs, lone_pairs (a backward-compatible alias), available_lone_pairs, available_lp, bond_order_sum, lp_bond_order_sum, valence_electrons, and oxidation_state.

Parameters:
  • node_attrs (Optional[List[str]]) – Optional whitelist of node attributes to keep.

  • edge_attrs (Optional[List[str]]) – Optional whitelist of edge attributes to keep.

  • attr_profile (str) – Atom feature profile, either "minimal" or "full".

  • with_topology (bool) – If True, run GraphAnnotator on the graph.

Raises:

ValueError – If attr_profile is unsupported.

from rdkit import Chem
from synkit.IO.mol_to_graph import MolToGraph

# Example 1: convert pyrrole and inspect the lone-pair fields on each atom.
mol = Chem.MolFromSmiles("c1cc[nH]c1")
graph = MolToGraph(attr_profile="minimal").transform(mol)

for node, data in graph.nodes(data=True):
    print(node, data["element"], data["lone_pairs"], data["available_lp"])

# Example 2: keep only selected attributes and use atom-map numbers as node IDs.
mol = Chem.MolFromSmiles("[CH3:1][CH2:2][Br:3]")
graph = MolToGraph(
    node_attrs=["element", "atom_map", "charge", "lone_pairs"],
    edge_attrs=["order", "kekule_order"],
).transform(mol, use_index_as_atom_map=True)
PAULING_EN: Dict[str, float] = {'B': 2.04, 'Br': 2.96, 'C': 2.55, 'Cl': 3.16, 'F': 3.98, 'H': 2.2, 'I': 2.66, 'N': 3.04, 'O': 3.44, 'P': 2.19, 'S': 2.58, 'Se': 2.55}

SUPPORTED_PROFILES = ('minimal', 'full')

static add_partial_charges(mol: Mol) → None

Compute Gasteiger partial charges in-place.

Parameters:

mol (Chem.Mol) – RDKit molecule to modify.

Returns:

None.

Return type:

None

classmethod estimate_available_lone_pairs(atom: Atom) → int

Estimate lone pairs locally available for LP-/B+ donation.

Parameters:

atom (Chem.Atom) – RDKit atom.

Returns:

Locally available lone-pair count.

Return type:

int

classmethod estimate_lone_pairs(atom: Atom) → int

Estimate total lone-pair count.

Parameters:

atom (Chem.Atom) – RDKit atom.

Returns:

Estimated total lone-pair count.

Return type:

int

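from rdkit import Chem
from synkit.IO.mol_to_graph import MolToGraph
# Estimate the total lone-pair count on the pyrrolic nitrogen.
mol = Chem.MolFromSmiles("c1cc[nH]c1")
n_atom = next(a for a in mol.GetAtoms() if a.GetSymbol() == "N")
print(MolToGraph.estimate_lone_pairs(n_atom))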
classmethod estimate_oxidation_states(mol: Mol, *, kek_mol: Mol | None = None, prefer_kekule: bool = True, en_tie_threshold: float = 0.05) → Dict[int, float]

Estimate atom oxidation states.

For each bond, bond electrons are assigned to the more electronegative atom. Formal charge is used as the starting value.

Parameters:
  • mol (Chem.Mol) – RDKit molecule.

  • kek_mol (Optional[Chem.Mol]) – Optional kekulized copy of mol.

  • prefer_kekule (bool) – Whether to prefer kekulized bond orders.

  • en_tie_threshold (float) – Electronegativity-difference threshold for treating a bond as a tie.

Returns:

Oxidation states keyed by RDKit atom index.

Return type:

Dict[int, float]
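The electronegativity rule described for estimate_oxidation_states can be illustrated without RDKit on a small hand-written bond list. In this sketch the function name, the input layout (explicit hydrogens, bonds as index pairs with an order), and the use of `<=` for the tie test are assumptions; only the electronegativity values are taken from the PAULING_EN table.

```python
from typing import Dict, List, Optional, Tuple

# Subset of the PAULING_EN table listed for MolToGraph.
PAULING_EN = {"H": 2.2, "C": 2.55, "O": 3.44}


def oxidation_states_sketch(
    elements: Dict[int, str],
    bonds: List[Tuple[int, int, float]],
    charges: Optional[Dict[int, int]] = None,
    en_tie_threshold: float = 0.05,
) -> Dict[int, float]:
    """Hypothetical helper: assign bond electrons to the more electronegative atom."""
    charges = charges or {}
    # Formal charge is the starting value for every atom.
    states = {idx: float(charges.get(idx, 0)) for idx in elements}
    for a, b, order in bonds:
        diff = PAULING_EN[elements[a]] - PAULING_EN[elements[b]]
        if abs(diff) <= en_tie_threshold:
            continue  # treated as a tie: no contribution from this bond
        winner, loser = (a, b) if diff > 0 else (b, a)
        states[winner] -= order
        states[loser] += order
    return states


# Methanol with explicit hydrogens: C(0), O(1), H(2-4) on carbon, H(5) on oxygen.
elements = {0: "C", 1: "O", 2: "H", 3: "H", 4: "H", 5: "H"}
bonds = [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 1.0), (0, 4, 1.0), (1, 5, 1.0)]
states = oxidation_states_sketch(elements, bonds)
print(states[0], states[1])  # -2.0 -2.0
```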

static get_bond_stereochemistry(bond: Bond) → str

Return E, Z, or N for double-bond stereochemistry.

Parameters:

bond (Chem.Bond) – RDKit bond.

Returns:

Simple bond stereochemistry label.

Return type:

str

static get_stereochemistry(atom: Atom) → str

Return S, R, or N from the RDKit chiral tag.

Parameters:

atom (Chem.Atom) – RDKit atom.

Returns:

Simple atom stereochemistry label.

Return type:

str

property graph: Graph

Return the graph produced by transform_store().

Returns:

Stored molecular graph.

Return type:

nx.Graph

Raises:

RuntimeError – If no graph has been stored yet.

static has_atom_mapping(mol: Mol) → bool

Return whether any atom has a non-zero atom-map number.

Parameters:

mol (Chem.Mol) – RDKit molecule.

Returns:

True if mapped.

Return type:

bool

classmethod help() → str

Return a short usage string.

Returns:

Usage summary.

Return type:

str

classmethod mol_to_graph(mol: Mol, drop_non_aam: bool = False, light_weight: bool = False, use_index_as_atom_map: bool = False) → Graph

Backward-compatible graph converter.

New code should usually prefer transform().

Parameters:
  • mol (Chem.Mol) – RDKit molecule.

  • drop_non_aam (bool) – If True, remove atoms with atom-map 0.

  • light_weight (bool) – If True, use reduced attributes.

  • use_index_as_atom_map (bool) – If True, use atom maps as node IDs.

Returns:

Molecular graph.

Return type:

nx.Graph

Raises:

ValueError – If drop_non_aam=True but use_index_as_atom_map=False.

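from rdkit import Chem
from synkit.IO.mol_to_graph import MolToGraph

# Legacy entry point: keep only mapped atoms and use atom-map numbers as node IDs.
mol = Chem.MolFromSmiles("[CH3:1][CH2:2][Br:3]")
graph = MolToGraph.mol_to_graph(
    mol,
    drop_non_aam=True,
    light_weight=True,
    use_index_as_atom_map=True,
)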
classmethod oxidation_states_by_atom_map(mol: Mol, *, kek_mol: Mol | None = None, prefer_kekule: bool = True, en_tie_threshold: float = 0.05) → Dict[int, Dict[str, Any]]

Return oxidation states keyed by non-zero atom-map number.

Parameters:
  • mol (Chem.Mol) – Mapped RDKit molecule.

  • kek_mol (Optional[Chem.Mol]) – Optional kekulized copy.

  • prefer_kekule (bool) – Whether to prefer kekulized bond orders.

  • en_tie_threshold (float) – Electronegativity tie threshold.

Returns:

Oxidation-state records keyed by atom-map number.

Return type:

Dict[int, Dict[str, Any]]

static random_atom_mapping(mol: Mol) → Mol

Assign random atom-map numbers from 1 to n in-place.

Parameters:

mol (Chem.Mol) – RDKit molecule to mutate.

Returns:

Same molecule with assigned atom-map numbers.

Return type:

Chem.Mol

classmethod reaction_oxidation_state_delta_from_rsmi(rsmi: str, *, threshold: float = 0.5, prefer_kekule: bool = True, en_tie_threshold: float = 0.05) → Dict[int, Dict[str, Any]]

Compute oxidation-state changes for mapped reaction SMILES.

Positive delta means oxidation; negative delta means reduction.

Parameters:
  • rsmi (str) – Mapped reaction SMILES containing ">>".

  • threshold (float) – Minimum absolute delta to report.

  • prefer_kekule (bool) – Whether to prefer kekulized bond orders.

  • en_tie_threshold (float) – Electronegativity tie threshold.

Returns:

Significant oxidation-state changes keyed by atom map.

Return type:

Dict[int, Dict[str, Any]]

Raises:

ValueError – If rsmi lacks ">>".

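from synkit.IO.mol_to_graph import MolToGraph
# Methanol to formaldehyde: the mapped carbon (atom map 1) is oxidized.
rsmi = "[CH3:1][OH:2]>>[CH2:1]=[O:2]"
print(MolToGraph.reaction_oxidation_state_delta_from_rsmi(rsmi))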
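The thresholding step described for reaction_oxidation_state_delta_from_rsmi can be sketched on precomputed oxidation states. The function name and the record keys ("reactant", "product", "delta") are hypothetical; the actual structure of the returned records is not specified here.

```python
from typing import Any, Dict


def oxidation_delta_sketch(
    reactant_states: Dict[int, float],
    product_states: Dict[int, float],
    threshold: float = 0.5,
) -> Dict[int, Dict[str, Any]]:
    """Hypothetical helper: report atom maps whose oxidation state changes."""
    changes: Dict[int, Dict[str, Any]] = {}
    # Only atom maps present on both sides can be compared.
    for atom_map in sorted(reactant_states.keys() & product_states.keys()):
        delta = product_states[atom_map] - reactant_states[atom_map]
        if abs(delta) >= threshold:
            changes[atom_map] = {
                "reactant": reactant_states[atom_map],
                "product": product_states[atom_map],
                "delta": delta,  # positive means oxidation
            }
    return changes


# Hand-assigned states for methanol -> formaldehyde (atom map 1 = C, 2 = O).
changes = oxidation_delta_sketch({1: -2.0, 2: -2.0}, {1: 0.0, 2: -2.0})
print(changes)
```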
transform(mol: Mol, drop_non_aam: bool = False, use_index_as_atom_map: bool = False) → Graph

Build a NetworkX graph from an RDKit molecule.

Parameters:
  • mol (Chem.Mol) – RDKit molecule.

  • drop_non_aam (bool) – If True, exclude atoms with atom-map 0.

  • use_index_as_atom_map (bool) – If True, use non-zero atom-map numbers as node identifiers; otherwise use atom index + 1.

Returns:

Molecular graph with atom and bond attributes.

Return type:

nx.Graph

Raises:

ValueError – If drop_non_aam=True but use_index_as_atom_map=False.

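from rdkit import Chem
from synkit.IO.mol_to_graph import MolToGraph

# Keep only mapped atoms and use their atom-map numbers as node identifiers.
mol = Chem.MolFromSmiles("[CH3:1][CH2:2][Br:3]")
graph = MolToGraph().transform(
    mol,
    drop_non_aam=True,
    use_index_as_atom_map=True,
)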
transform_store(mol: Mol, drop_non_aam: bool = False, use_index_as_atom_map: bool = False) → MolToGraph

Build, store, and return self.

Parameters:
  • mol (Chem.Mol) – RDKit molecule.

  • drop_non_aam (bool) – If True, exclude atoms with atom-map 0.

  • use_index_as_atom_map (bool) – If True, use atom maps as node IDs.

Returns:

Current converter instance.

Return type:

MolToGraph

class synkit.IO.graph_to_mol.GraphToMol(node_attributes: Dict[str, str] = {'atom_map': 'atom_map', 'charge': 'charge', 'element': 'element'}, edge_attributes: Dict[str, str] = {'order': 'order'})

Bases: object

Converts a NetworkX graph representation of a molecule into an RDKit molecule object.

This class reconstructs RDKit molecules from node and edge attributes in a graph, correctly interpreting atom types, charges, mapping numbers, bond orders, and optionally explicit hydrogen counts.

Parameters:
  • node_attributes (Dict[str, str]) – Mapping of expected attribute names to node keys in the graph. For example, {“element”: “element”, “charge”: “charge”, “atom_map”: “atom_map”}.

  • edge_attributes (Dict[str, str]) – Mapping of expected attribute names to edge keys in the graph. For example, {“order”: “order”}.

static get_bond_type_from_order(order: float) → BondType

Converts a numerical bond order into the corresponding RDKit BondType.

Parameters:

order (float) – The numerical bond order (typically 1, 2, or 3, with 1.5 for aromatic bonds).

Returns:

The corresponding RDKit bond type (single, double, triple, or aromatic).

Return type:

Chem.BondType
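As a rough illustration of this kind of lookup, the sketch below maps numeric orders to the names of RDKit's bond types without importing RDKit. The function name and the fallback to a single bond for unrecognized orders are assumptions, not documented behavior of get_bond_type_from_order.

```python
def bond_type_name_sketch(order: float) -> str:
    """Hypothetical helper: name of the RDKit bond type for a numeric order."""
    names = {
        1.0: "SINGLE",
        1.5: "AROMATIC",
        2.0: "DOUBLE",
        3.0: "TRIPLE",
    }
    # Assumed fallback: treat anything unrecognized as a single bond.
    return names.get(float(order), "SINGLE")


for order in (1, 1.5, 2, 3):
    print(order, bond_type_name_sketch(order))
```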

graph_to_mol(graph: Graph, ignore_bond_order: bool = False, sanitize: bool = True, use_h_count: bool = False) → Mol

Converts a NetworkX graph into an RDKit molecule.

Parameters:
  • graph (nx.Graph) – The NetworkX graph representing the molecule.

  • ignore_bond_order (bool) – If True, all bonds are created as single bonds regardless of edge attributes. Defaults to False.

  • sanitize (bool) – If True, the resulting RDKit molecule will be sanitized after construction. Defaults to True.

  • use_h_count (bool) – If True, the ‘hcount’ attribute (if present) will be used to set explicit hydrogen counts on atoms. Defaults to False.

Returns:

An RDKit molecule constructed from the graph’s nodes and edges.

Return type:

Chem.Mol

class synkit.IO.gml_to_nx.GMLToNX(gml_text: str)

Bases: object

Parses GML-like text and transforms it into three NetworkX graphs representing the left, right, and context graphs of a chemical reaction step.

Parameters:

gml_text (str) – The GML-like text to parse.

Variables:

graphs (dict[str, nx.Graph]) – A dictionary containing ‘left’, ‘right’, and ‘context’ NetworkX graphs.

transform() → Tuple[Graph, Graph, Graph]

Transforms the GML-like text into three NetworkX graphs: left, right, and context.

Returns:

A tuple of (left_graph, right_graph, context_graph), each a NetworkX graph.

Return type:

tuple[nx.Graph, nx.Graph, nx.Graph]

class synkit.IO.nx_to_gml.NXToGML

Bases: object

Converts NetworkX graph representations of chemical reactions to GML (Graph Modelling Language) strings. Useful for exporting reaction rules in a standard graph format.

This class provides static methods for converting individual graphs, sets of reaction graphs, and managing charge/attribute changes in the export process.

static transform(graph_rules: Tuple[Graph, Graph, Graph], rule_name: str = 'Test', reindex: bool = False, attributes: List[str] = ['charge'], explicit_hydrogen: bool = False) → str

Processes a triple of reaction graphs to generate a GML string rule, with options for node reindexing and explicit hydrogen expansion.

Parameters:
  • graph_rules (tuple[nx.Graph, nx.Graph, nx.Graph]) – Tuple containing (L, R, K) reaction graphs.

  • rule_name (str) – The rule name to use in the output.

  • reindex (bool) – Whether to reindex node IDs based on the L graph sequence.

  • attributes (list[str]) – List of attribute names to check for node changes.

  • explicit_hydrogen (bool) – Whether to explicitly include hydrogen atoms in the output.

Returns:

The GML string representing the chemical rule.

Return type:

str

Data and debug

synkit.IO.data_io.collect_data(num_batches: int, temp_dir: str, file_template: str) → List[Any]

Collects and aggregates data from multiple pickle files into a single list.

Parameters:
  • num_batches (int) – The number of batch files to process.

  • temp_dir (str) – The directory where the batch files are stored.

  • file_template (str) – The template string for batch file names, expecting an integer formatter.

Returns:

A list of aggregated data items from all batch files.

Return type:

list
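A batch-aggregation routine of this shape could look like the sketch below. It assumes that the template is a `str.format` pattern such as "batch_{}.pkl", that batches are numbered from zero, and that every file holds a list; none of these details are confirmed for collect_data, and the name `collect_batches` is hypothetical.

```python
import os
import pickle
import tempfile
from typing import Any, List


def collect_batches(num_batches: int, temp_dir: str, file_template: str) -> List[Any]:
    """Hypothetical helper: merge the lists stored in numbered pickle files."""
    collected: List[Any] = []
    for index in range(num_batches):
        path = os.path.join(temp_dir, file_template.format(index))
        with open(path, "rb") as handle:
            collected.extend(pickle.load(handle))
    return collected


# Write three small batches to a temporary directory, then merge them.
with tempfile.TemporaryDirectory() as tmp:
    for index, batch in enumerate([[1, 2], [3], [4, 5]]):
        with open(os.path.join(tmp, f"batch_{index}.pkl"), "wb") as handle:
            pickle.dump(batch, handle)
    merged = collect_batches(3, tmp, "batch_{}.pkl")

print(merged)  # [1, 2, 3, 4, 5]
```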

synkit.IO.data_io.load_compressed(filename: str) → ndarray

Loads a NumPy array from a compressed .npz file.

Parameters:

filename (str) – The path of the .npz file to load.

Returns:

The loaded NumPy array.

Return type:

numpy.ndarray

Raises:

KeyError – If the .npz file does not contain an array with the key ‘array’.

synkit.IO.data_io.load_database(pathname: str = './Data/database.json') → List[Dict]

Load a database (a list of dictionaries) from a JSON file.

Parameters:

pathname (str) – The path from where the database will be loaded. Defaults to ‘./Data/database.json’.

Returns:

The loaded database.

Return type:

list[dict]

Raises:

ValueError – If there is an error reading the file.

synkit.IO.data_io.load_dg(path: str, graph_db: list, rule_db: list)

Load a DG instance from a dumped file.

Parameters:
  • path (str) – The file path of the dumped graph.

  • graph_db (list) – List of Graph objects representing the graph database.

  • rule_db (list) – List of Rule objects required for loading the DG.

Returns:

The loaded derivation graph instance.

Return type:

DG

Raises:

Exception – If loading fails.

synkit.IO.data_io.load_dict_from_json(file_path: str) → dict | None

Load a dictionary from a JSON file.

Parameters:

file_path (str) – The path to the JSON file from which to load the dictionary.

Returns:

The dictionary loaded from the JSON file, or None if an error occurs.

Return type:

dict or None

synkit.IO.data_io.load_from_pickle(filename: str) → List[Any]

Load data from a pickle file.

Parameters:

filename (str) – The name of the pickle file to load data from.

Returns:

The data loaded from the pickle file.

Return type:

list

synkit.IO.data_io.load_from_pickle_generator(file_path: str) → Generator[Any, None, None]

A generator that yields items from a pickle file where each pickle load returns a list of dictionaries.

Parameters:

file_path (str) – The path to the pickle file to load.

Yields:

A single item from the list of dictionaries stored in the pickle file.

Return type:

Any
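One way to stream items from a file that holds several pickled lists is to call `pickle.load` repeatedly until the end of the file. The sketch below follows that pattern; the name `iter_pickled_items` is hypothetical and the library's generator may be implemented differently.

```python
import os
import pickle
import tempfile
from typing import Any, Generator


def iter_pickled_items(file_path: str) -> Generator[Any, None, None]:
    """Hypothetical helper: yield items from consecutive pickled lists."""
    with open(file_path, "rb") as handle:
        while True:
            try:
                batch = pickle.load(handle)
            except EOFError:
                break  # no more pickled objects in the file
            yield from batch


# Dump two lists of dictionaries into one file, then stream them back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.pkl")
    with open(path, "wb") as handle:
        pickle.dump([{"id": 1}, {"id": 2}], handle)
        pickle.dump([{"id": 3}], handle)
    items = list(iter_pickled_items(path))

print(items)  # [{'id': 1}, {'id': 2}, {'id': 3}]
```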

synkit.IO.data_io.load_gml_as_text(gml_file_path: str) → str | None

Load the contents of a GML file as a text string.

Parameters:

gml_file_path (str) – The file path to the GML file.

Returns:

The text content of the GML file, or None if the file does not exist or an error occurs.

Return type:

str or None

synkit.IO.data_io.load_list_from_file(file_path: str) → list

Load a list from a JSON-formatted file.

Parameters:

file_path (str) – The path to the file to read the list from.

Returns:

The list loaded from the file.

Return type:

list

synkit.IO.data_io.load_model(filename: str) → Any

Load a machine learning model from a file using joblib.

Parameters:

filename (str) – The path to the file from which the model will be loaded.

Returns:

The loaded machine learning model.

Return type:

object

synkit.IO.data_io.save_compressed(array: ndarray, filename: str) → None

Saves a NumPy array in a compressed format using .npz extension.

Parameters:
  • array (numpy.ndarray) – The NumPy array to be saved.

  • filename (str) – The file path or name to save the array to, with a ‘.npz’ extension.

synkit.IO.data_io.save_database(database: List[Dict], pathname: str = './Data/database.json') → None

Save a database (a list of dictionaries) to a JSON file.

Parameters:
  • database (list[dict]) – The database to be saved.

  • pathname (str) – The path where the database will be saved. Defaults to ‘./Data/database.json’.

Raises:
  • TypeError – If the database is not a list of dictionaries.

  • ValueError – If there is an error writing the file.
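The error contract documented for save_database and load_database (TypeError for a malformed database, ValueError for file problems) can be sketched with the standard library alone. The names `save_db` and `load_db` and the exact exception messages are hypothetical.

```python
import json
import os
import tempfile
from typing import Dict, List


def save_db(database: List[Dict], pathname: str) -> None:
    """Hypothetical helper: validate, then write a list of dicts as JSON."""
    if not isinstance(database, list) or not all(isinstance(row, dict) for row in database):
        raise TypeError("database must be a list of dictionaries")
    try:
        with open(pathname, "w", encoding="utf-8") as handle:
            json.dump(database, handle)
    except OSError as exc:
        raise ValueError(f"error writing {pathname}: {exc}") from exc


def load_db(pathname: str) -> List[Dict]:
    """Hypothetical helper: read the JSON file back, wrapping read errors."""
    try:
        with open(pathname, "r", encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, json.JSONDecodeError) as exc:
        raise ValueError(f"error reading {pathname}: {exc}") from exc


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "database.json")
    save_db([{"id": 1, "rsmi": "CC>>CC"}], db_path)
    restored = load_db(db_path)

print(restored)
```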

synkit.IO.data_io.save_dg(dg, path: str) → str

Save a DG instance to disk using MØD’s dump method.

Parameters:
  • dg (DG) – The derivation graph to save.

  • path (str) – The file path where the graph will be dumped.

Returns:

The path of the dumped file.

Return type:

str

Raises:

Exception – If saving fails.

synkit.IO.data_io.save_dict_to_json(data: dict, file_path: str) → None

Save a dictionary to a JSON file.

Parameters:
  • data (dict) – The dictionary to be saved.

  • file_path (str) – The path to the file where the dictionary should be saved.

synkit.IO.data_io.save_list_to_file(data_list: list, file_path: str) → None

Save a list to a file in JSON format.

Parameters:
  • data_list (list) – The list to save.

  • file_path (str) – The path to the file where the list will be saved.

synkit.IO.data_io.save_model(model: Any, filename: str) → None

Save a machine learning model to a file using joblib.

Parameters:
  • model (object) – The machine learning model to save.

  • filename (str) – The path to the file where the model will be saved.

synkit.IO.data_io.save_text_as_gml(gml_text: str, file_path: str) → bool

Save a GML text string to a file.

Parameters:
  • gml_text (str) – The GML content as a text string.

  • file_path (str) – The file path where the GML text will be saved.

Returns:

True if saving was successful, False otherwise.

Return type:

bool

synkit.IO.data_io.save_to_pickle(data: List[Dict[str, Any]], filename: str) → None

Save a list of dictionaries to a pickle file.

Parameters:
  • data (list[dict]) – A list of dictionaries to be saved.

  • filename (str) – The name of the file where the data will be saved.

synkit.IO.debug.configure_warnings_and_logs(ignore_warnings: bool = False, disable_rdkit_logs: bool = False) → None

Configures Python warnings and RDKit log behavior based on input flags.

Parameters:
  • ignore_warnings (bool) – Whether to suppress all Python warnings. Default is False.

  • disable_rdkit_logs (bool) – Whether to disable RDKit error and warning logs. Default is False.

Returns:

None.

Usage:

Use this function to control verbosity (e.g. in production or testing), but use it with caution during development to avoid missing critical issues.
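A function with these two flags might be built on `warnings.filterwarnings` and RDKit's `RDLogger.DisableLog`, as in the sketch below. This is an assumption about one plausible implementation, not the library's code; the name `configure_sketch` is hypothetical, and the RDKit call is skipped when RDKit is not installed.

```python
import warnings


def configure_sketch(ignore_warnings: bool = False, disable_rdkit_logs: bool = False) -> None:
    """Hypothetical helper: silence Python warnings and/or RDKit logs."""
    if ignore_warnings:
        warnings.filterwarnings("ignore")
    if disable_rdkit_logs:
        try:
            from rdkit import RDLogger
        except ImportError:
            return  # RDKit not available: nothing to silence
        RDLogger.DisableLog("rdApp.*")


# Demonstrate inside catch_warnings so the global filters are restored afterwards.
with warnings.catch_warnings(record=True) as caught:
    configure_sketch(ignore_warnings=True)
    warnings.warn("this warning is suppressed")

print(len(caught))  # 0
```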

synkit.IO.debug.setup_logging(log_level: str = 'INFO', log_filename: str = None, task_type: str = None) → Logger

Configures logging to either the console or a file, based on provided parameters.

Parameters:
  • log_level (str) – Logging level to set. Defaults to ‘INFO’. Options: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’.

  • log_filename (str or None) – If provided, logs are written to this file. Defaults to None (logs to console).

  • task_type (str or None) – Logger name/namespace. Useful for distinguishing loggers in multi-task settings. Defaults to None.

Returns:

Configured logger instance.

Return type:

logging.Logger

Raises:

ValueError – If an invalid log level is provided.
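The behavior described for setup_logging (level validation, console or file output, a named logger) can be sketched with the standard `logging` module. The function name, the message format string, and the choice to replace existing handlers are assumptions made for illustration.

```python
import logging
from typing import Optional

_VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def setup_logging_sketch(
    log_level: str = "INFO",
    log_filename: Optional[str] = None,
    task_type: Optional[str] = None,
) -> logging.Logger:
    """Hypothetical helper: configure a logger for console or file output."""
    level_name = log_level.upper()
    if level_name not in _VALID_LEVELS:
        raise ValueError(f"Invalid log level: {log_level}")

    logger = logging.getLogger(task_type)
    logger.setLevel(getattr(logging, level_name))

    # File handler if a filename is given, otherwise log to the console.
    handler = logging.FileHandler(log_filename) if log_filename else logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))

    # Assumed policy: replace existing handlers to avoid duplicate messages.
    logger.handlers.clear()
    logger.addHandler(handler)
    return logger


logger = setup_logging_sketch("DEBUG", task_type="demo_task")
logger.debug("logger configured")
```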