Synthesis#
Synthesis engines, multi-step search, benchmarking, and ranking utilities.
Reactor#
- class synkit.Synthesis.Reactor.batch_reactor.BatchReactor(data: List[str | Dict[str, Any]], host_key: str | None = None, *, react_engine: str = 'syn', pre_filter_engine: str | None = None, explicit_h: bool = True, implicit_temp: bool = False, strategy: str = 'bt', dedupe: bool = True, entry_n_jobs: int = 1, rule_n_jobs: int = 1, parallel_rules: bool = False, allow_nested: bool = False, cache_enabled: bool = True, cache_maxsize: int = 32768, logger: Logger | None = None, enable_logging: bool = True)[source]#
Bases: object
Parallel, cache-aware batch application of reaction rules to SMILES substrates.
- Parameters:
data (list of str or dict) – List of SMILES strings or dicts containing SMILES under host_key.
host_key (str or None) – Key to extract SMILES from dict entries (optional).
react_engine (str) – Reactor engine: ‘syn’ or ‘mod’.
pre_filter_engine (str or None) – Pre-filtering engine for rules (None to skip).
explicit_h (bool) – Use explicit hydrogens in SynReactor.
implicit_temp (bool) – Use implicit templates in SynReactor.
strategy (str) – Matching strategy for SynReactor.
dedupe (bool) – Deduplicate results per-substrate.
entry_n_jobs (int) – Number of parallel jobs for substrates.
rule_n_jobs (int) – Number of parallel jobs for rules per substrate.
parallel_rules (bool) – Enable parallelism over rules.
allow_nested (bool) – Allow nested parallelism.
cache_enabled (bool) – Enable in-process per-rule caching.
cache_maxsize (int) – Max entries in per-process cache.
logger (logging.Logger or None) – Optional custom logger.
- Raises:
ValueError – If react_engine is invalid or SMILES/rule conversion fails.
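The data and host_key parameters accept either bare SMILES strings or dicts that carry the SMILES under a key. A minimal sketch of that normalisation is shown below; `extract_smiles` is a hypothetical helper written for illustration, not part of SynKit, and the real class may validate its input differently.

```python
from typing import Any, Dict, List, Optional, Union


def extract_smiles(
    data: List[Union[str, Dict[str, Any]]],
    host_key: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: normalise mixed str/dict entries to SMILES strings."""
    smiles: List[str] = []
    for entry in data:
        if isinstance(entry, str):
            # Plain SMILES string: use as-is.
            smiles.append(entry)
        elif isinstance(entry, dict):
            # Dict entry: the SMILES is stored under host_key.
            if host_key is None:
                raise ValueError("host_key is required for dict entries")
            smiles.append(entry[host_key])
        else:
            raise ValueError(f"unsupported entry type: {type(entry).__name__}")
    return smiles


mixed = ["CCO", {"smiles": "CC(=O)O", "id": 7}]
print(extract_smiles(mixed, host_key="smiles"))
```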
- describe() str[source]#
Return a configuration summary.
- Returns:
Human-readable settings overview.
- Return type:
str
- class synkit.Synthesis.Reactor.benchmark.Benchmark(data: List[Dict[str, Any]], reaction_key: str = 'reactions', *, react_engine: str = 'syn', pre_filter_engine: str | None = None, explicit_h: bool = True, implicit_temp: bool = False, strategy: str = 'bt', dedupe: bool = True, entry_n_jobs: int = 1, rule_n_jobs: int = 1, parallel_rules: bool = False, allow_nested: bool = False, cache_enabled: bool = True, cache_maxsize: int = 32768, logger: Logger | None = None, enable_logging: bool = True)[source]#
Bases: BatchReactor
Extension of BatchReactor to benchmark forward/backward application on reaction-SMILES entries.
- Parameters:
data (list of dict) – List of dicts containing reaction SMILES under reaction_key.
reaction_key (str) – Key for reaction-SMILES strings (format ‘reactants>>products’).
react_engine (str) – Reactor engine: ‘syn’ or ‘mod’.
pre_filter_engine (str or None) – Pre-filtering engine for rules (None to skip).
explicit_h (bool) – Use explicit hydrogens in SynReactor.
implicit_temp (bool) – Use implicit templates in SynReactor.
strategy (str) – Matching strategy for SynReactor.
dedupe (bool) – Deduplicate results per substrate.
entry_n_jobs (int) – Parallel jobs for substrates.
rule_n_jobs (int) – Parallel jobs for rules per substrate.
parallel_rules (bool) – Enable rule-level parallelism.
allow_nested (bool) – Allow nested parallelism.
cache_enabled (bool) – Enable per-process caching.
cache_maxsize (int) – Max cache entries before eviction.
logger (logging.Logger or None) – Optional custom logger.
- Raises:
ValueError – If reaction_key entry malformed or SMILES invalid.
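Each entry is expected to hold a reaction SMILES of the form 'reactants>>products' under reaction_key. The sketch below shows one way such a string could be split and rejected when malformed; `split_reaction` is a hypothetical validator for illustration only, and the checks Benchmark actually performs are not documented here.

```python
from typing import Tuple


def split_reaction(rsmi: str) -> Tuple[str, str]:
    """Hypothetical validator for 'reactants>>products' strings."""
    parts = rsmi.split(">>")
    # Exactly one arrow and two non-empty sides are required.
    if len(parts) != 2 or not all(p.strip() for p in parts):
        raise ValueError(f"malformed reaction SMILES: {rsmi!r}")
    return parts[0].strip(), parts[1].strip()


entry = {"reactions": "CCO.CBr>>CCOC.Br"}
reactants, products = split_reaction(entry["reactions"])
print(reactants, products)
```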
- describe() str[source]#
Return detailed configuration for Benchmark, including reaction_key.
- Returns:
Multi-line summary.
- Return type:
str
- class synkit.Synthesis.Reactor.imba_engine.ImbaEngine(substrate: str | Graph | SynGraph, template: str | Graph | SynRule, add_wildcard: bool = True, clean_fragments: bool = False, max_frag: bool = False, invert: bool = False, canonicaliser: GraphCanonicaliser | None = None, strategy: Strategy | str = Strategy.ALL, partial: bool = False, embed_threshold: float = None, embed_pre_filter: bool = False, electron_diagnostics: bool = False)[source]#
Bases: object
Reactor for applying a SynKit reaction template to a substrate, with options for inversion, canonicalisation, matching strategy, partial ITS construction, radical-wildcard appending, and fragment cleaning in products.
- Parameters:
substrate (Union[str, nx.Graph, SynGraph]) – Input substrate; SMILES string, networkx.Graph, or SynGraph.
template (Union[str, nx.Graph, SynRule]) – Reaction template; SMARTS (bracketed) string, networkx.Graph, or SynRule.
add_wildcard (bool) – If True, apply radical wildcard transform to each product SMARTS.
clean_fragments (bool) – If True, remove wildcard fragments and optionally keep max fragment.
max_frag (bool) – If True, force maximal fragment selection when cleaning.
invert (bool) – If True, apply the template in reverse (product → reactant).
canonicaliser (Optional[GraphCanonicaliser]) – Optional GraphCanonicaliser for preprocessing or postprocessing.
strategy (Union[Strategy, str]) – Enumeration strategy (Strategy enum or string).
partial (bool) – If True, perform partial ITS graph construction on results.
- fit() ImbaEngine[source]#
Apply the reaction template to the substrate, producing product SMARTS. Optionally clean wildcard fragments and add radical wildcards. Results are stored internally and self is returned.
- Returns:
self
- Return type:
ImbaEngine
- Raises:
ValueError – If substrate cannot be parsed or reaction fails.
- class synkit.Synthesis.Reactor.mod_aam.MODAAM(substrate: str | List[str], rule_file: str | Path, *, invert: bool = False, strategy: str | Strategy = Strategy.BACKTRACK, verbosity: int = 0, print_results: bool = False, check_isomorphic: bool = True)[source]#
Bases: object
Runs MØD (via MODReactor) then a full AAM/ITS post-processing pipeline.
Parameters#
substrate (Union[str, List[str]]) – Dot-delimited SMILES or list of SMILES for reactants.
rule_file (Union[str, Path]) – GML rule file path or raw GML/SMARTS string.
invert (bool, optional) – If True, apply the rule in reverse (default False).
strategy (Union[str, Strategy], optional) – Matching strategy: ALL, COMPONENT, or BACKTRACK (default BACKTRACK).
verbosity (int, optional) – Verbosity for MODReactor (default 0).
print_results (bool, optional) – If True, print the derivation graph (default False).
check_isomorphic (bool, optional) – If True, deduplicate results by isomorphism (default True).
- synkit.Synthesis.Reactor.mod_aam.expand_aam(rsmi: str, rule: str) List[str][source]#
Expand Atom–Atom Mapping (AAM) for a given reaction SMARTS/SMILES (rsmi) using a pre‐sanitized GML rule string.
Parameters#
rsmi (str) – Reaction SMILES/SMARTS in ‘reactants>>products’ form.
rule (str) – A GML rule string (already sanitized upstream).
Returns#
- List[str]
All reaction SMILES from MODAAM whose standardized form matches rsmi.
- class synkit.Synthesis.Reactor.mod_reactor.MODReactor(substrate: str | List[str], rule_file: str | Path, *, invert: bool = False, strategy: str | Strategy = Strategy.BACKTRACK, verbosity: int = 0, print_results: bool = False)[source]#
Bases: object
Lazy, ergonomic wrapper around the MØD toolkit’s derivation pipeline.
Workflow#
1. Instantiate: give substrate SMILES and a rule GML (path or string).
2. Call .run() to execute the reaction strategy.
3. Inspect results via .get_reaction_smiles(), .product_sets, .get_dg(), etc.
Attributes#
initial_smiles (List[str]) – List of SMILES strings for reactants (or products, if inverted).
rule_file (Path) – Filesystem path, raw GML string, or raw SMARTS with AAM for the reaction rule.
invert (bool) – If True, apply the rule in reverse (products → reactants).
strategy (Strategy) – One of ALL, COMPONENT, or BACKTRACK.
verbosity (int) – Verbosity level for the MØD DG.apply() call.
print_results (bool) – If True, prints the derivation graph to stdout.
- static generate_reaction_smiles(temp_results: List[List[str]], base_smiles: str, *, invert: bool = False, arrow: str = '>>', separator: str = '.') List[str][source]#
Build reaction SMILES of the form “A>>B”, where A and B swap roles if invert=True.
Parameters#
temp_results (List[List[str]]) – Batches of product (or reactant) SMILES.
base_smiles (str) – The “other side” of the reaction: the reactant side when invert=False, or the product side when invert=True.
invert (bool) – If False, generates “base_smiles>>joined_batch”; if True, generates “joined_batch>>base_smiles”.
arrow (str) – The reaction arrow to use (default “>>”).
separator (str) – How to join multiple SMILES in a batch (default “.”).
Returns#
- List[str]
Reaction SMILES strings, one per batch.
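The joining rule described by these parameters can be sketched in plain Python. `build_reaction_smiles` below is a stand-in written from the parameter descriptions, not the library's own code, and the SMILES are toy values.

```python
from typing import List


def build_reaction_smiles(
    temp_results: List[List[str]],
    base_smiles: str,
    *,
    invert: bool = False,
    arrow: str = ">>",
    separator: str = ".",
) -> List[str]:
    """Stand-in that mirrors the documented behaviour of generate_reaction_smiles."""
    reactions = []
    for batch in temp_results:
        joined = separator.join(batch)  # one side of the reaction, per batch
        if invert:
            reactions.append(f"{joined}{arrow}{base_smiles}")
        else:
            reactions.append(f"{base_smiles}{arrow}{joined}")
    return reactions


batches = [["CCOC", "Br"], ["CCBr", "CO"]]
forward = build_reaction_smiles(batches, "CCO.CBr")
backward = build_reaction_smiles(batches, "CCO.CBr", invert=True)
print(forward)
print(backward)
```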
- get_dg() None[source]#
Access the underlying derivation graph.
Returns#
- DG
The MØD derivation graph constructed during .run().
Raises#
- RuntimeError
If .run() has not yet been called.
- get_reaction_smiles() List[str][source]#
Retrieve the reaction SMILES strings (lazy).
Returns#
- List[str]
List of reaction SMILES, in “A>>B” format.
- property product_sets: List[List[str]]#
Raw product sets (lists of SMILES) before joining into full reactions.
- property reaction_smiles: List[str]#
Lazy-loaded reaction SMILES strings of form “A>>B”.
Returns#
List[str]
- run() MODReactor[source]#
Execute the chosen strategy once and return self so you can chain:
```python
r = MODReactor(...).run()
smiles = r.get_reaction_smiles()
```
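The run-once-and-chain pattern can be illustrated without MØD installed. The class below is a hypothetical sketch (`LazyReactorSketch` and `get_components` are illustrative names, and splitting the SMILES stands in for the real derivation); it only shows how returning self enables chaining and how an accessor can raise RuntimeError before .run() has been called.

```python
from typing import List, Optional


class LazyReactorSketch:
    """Hypothetical stand-in showing the run()/chain pattern only."""

    def __init__(self, substrate: str) -> None:
        self.substrate = substrate
        self._components: Optional[List[str]] = None  # filled by run()

    def run(self) -> "LazyReactorSketch":
        # Do the work once; later calls reuse the cached result.
        if self._components is None:
            self._components = self.substrate.split(".")  # placeholder work
        return self  # returning self allows Sketch(...).run().get_components()

    def get_components(self) -> List[str]:
        if self._components is None:
            raise RuntimeError("call .run() first")
        return self._components


r = LazyReactorSketch("CCO.CBr").run()
print(r.get_components())
```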
- class synkit.Synthesis.Reactor.partial_engine.PartialEngine(smi: str, template: str, electron_diagnostics: bool = False)[source]#
Bases: object
Partial Reaction Learning Engine that applies a single‐direction (forward or backward) template transformation, injects radical wildcards, and returns a list of intermediate ITS strings.
- Parameters:
- fit(invert: bool = False) list[str][source]#
Apply the template in one direction to generate radical‐wildcarded reaction SMARTS (ITS).
Instantiates a SynReactor on the host graph and ITS.
Sets partial, implicit‐template, and explicit‐H flags.
If invert=True, runs the backward direction; otherwise forward.
Post‐processes each reaction SMARTS with RadicalWildcardAdder.
- class synkit.Synthesis.Reactor.post_syn.PostSyn(n_jobs: int = 1, verbose: int = 2, standardizer: Standardize | None = None, reaction_key: str = 'reactions', fw_key: str = 'fw', bw_key: str = 'bw')[source]#
Bases: object
Post-processing helper for reaction data: standardize reactions and clean AAM strings, with optional parallelism, progress reporting, and filtering of incomplete reaction SMILES inside fw/bw lists. Input keys for reaction, fw, and bw are configurable.
- clean_aam(list_aam: Iterable[str], remove_radical: bool = True) List[str][source]#
Remove atom-atom mappings, optionally clean radicals, deduplicate while preserving order.
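As a rough illustration of "remove mappings, then deduplicate while preserving order", the sketch below strips atom-map numbers with a regular expression and relies on dict key order for the deduplication. This is a deliberately naive assumption: it does not canonicalise SMILES, does not simplify brackets, and does not cover the remove_radical option, so it should not be read as the method's implementation.

```python
import re
from typing import Iterable, List

# Matches an atom-map suffix such as ":12" directly before a closing bracket.
_ATOM_MAP = re.compile(r":\d+(?=\])")


def strip_maps_and_dedupe(list_aam: Iterable[str]) -> List[str]:
    """Naive sketch: drop atom-map numbers, keep the first occurrence of each string."""
    unmapped = (_ATOM_MAP.sub("", rsmi) for rsmi in list_aam)
    # dict preserves insertion order, so duplicates collapse onto the first hit.
    return list(dict.fromkeys(unmapped))


mapped = [
    "[CH3:1][Cl:2].[Br-:3]>>[CH3:1][Br:3].[Cl-:2]",
    "[CH3:3][Cl:1].[Br-:2]>>[CH3:3][Br:2].[Cl-:1]",  # same reaction, other numbering
    "[CH3:1][I:2].[Br-:3]>>[CH3:1][Br:3].[I-:2]",
]
cleaned = strip_maps_and_dedupe(mapped)
print(cleaned)
```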
- process(data: Iterable[Dict[str, Any]], *, progress: bool = False, prefilter: Callable[[Dict[str, Any]], bool] | None = None, filter_incomplete_rxn: bool = True) List[Dict[str, Any]][source]#
Process reaction entries.
- Parameters:
data – iterable of dicts.
progress – show progress bar if True.
prefilter – predicate to pre-filter entries.
filter_incomplete_rxn – if True, drop incomplete SMILES inside fw/bw.
- Returns:
processed list with standardized reaction and cleaned fw/bw under their original keys.
- class synkit.Synthesis.Reactor.rbl_engine.RBLEngine(*, wildcard_element: str = '*', element_key: str = 'element', node_attrs: ~typing.Sequence[str] | None = None, edge_attrs: ~typing.Sequence[str] | None = None, prune_wc: bool = True, prune_automorphisms: bool = True, mcs_side: str = 'l', early_stop: bool = True, fast_paths_only: bool = False, max_mappings_per_pair: int = 1, implicit_temp: bool = True, explicit_h: bool = False, electron_diagnostics: bool = False, embed_threshold: int = 10000, reactor_cls: type = <class 'synkit.Synthesis.Reactor.syn_reactor.SynReactor'>, wildcard_adder_cls: type = <class 'synkit.Chem.Reaction.radical_wildcard.RadicalWildcardAdder'>, matcher_cls: type = <class 'synkit.Graph.Matcher.mcs_matcher.MCSMatcher'>, fuse_fn: ~typing.Callable[[~typing.Any, ~typing.Any, ~typing.Dict[~typing.Any, ~typing.Any]], ~typing.Any] = <function fuse_its_graphs>, remove_explicit_H_fn: ~typing.Callable[[str], str] = <function remove_explicit_H_from_rsmi>, rsmi_to_its_fn: ~typing.Callable[[...], ~typing.Any] = <function rsmi_to_its>, its_to_rsmi_fn: ~typing.Callable[[~typing.Any], str] = <function its_to_rsmi>, h_to_implicit_fn: ~typing.Callable[[~typing.Any], ~typing.Any] = <function h_to_implicit>, standardize_h_fn: ~typing.Callable[[~typing.Any], ~typing.Any] = <function standardize_hydrogen>, standardize_fn: ~typing.Callable[[str], str] | None = <bound method Standardize.fit of <synkit.Chem.Reaction.standardize.Standardize object>>, logger: ~logging.Logger | None = None)[source]#
Bases: object
Radical-based linking (RBL) engine for bidirectional template application and ITS-graph fusion using wildcard-based subgraph matching.
Overview#
The RBL engine turns a reaction template (RSMI or ITS graph) into a set of fused reaction graphs that link forward and backward template applications through a wildcard-aware core. The workflow is:
1. Template preparation: Convert a template (RSMI or ITS graph) into a standardized ITS representation with normalized hydrogen handling.
2. Forward / backward application: Use SynReactor to apply the template to a substrate (reactants or products) in forward or inverted mode, convert to RSMI, decorate with radical wildcards, and convert back to ITS.
3. Wildcard-based fusion: For each forward/backward ITS pair, run a matcher (MCSMatcher or ApproxMCSMatcher) to detect a core overlap (ignoring wildcard regions) and fuse the graphs via fuse_its_graphs(). The fused ITS graphs are then converted back to post-processed RSMI strings.
Matching back-ends: exact vs. approximate#
The engine delegates ITS matching to matcher_cls, which is assumed to be API-compatible with MCSMatcher:
- MCSMatcher (default)
  - Exhaustive maximum-common-subgraph search based on networkx.algorithms.isomorphism.GraphMatcher.
  - Respects prune_wc and prune_automorphisms.
  - Produces exact MCS mappings but can be expensive on large or highly symmetric graphs.
- ApproxMCSMatcher
  - Heuristic / greedy approximate MCS search.
  - Uses seed selection and local greedy growth instead of exhaustive enumeration.
  - Much faster on large graphs but only approximate – mappings are usually close to optimal in practice but not guaranteed to be globally maximal.
Any custom matcher can be plugged in as long as it implements the MCSMatcher public API:
- __init__(node_attrs, node_defaults, edge_attrs, prune_wc, ...)
- find_rc_mapping()
- get_mappings()
Early-stop semantics#
The engine exposes two orthogonal control flags: early_stop and fast_paths_only.
If early_stop is True:
1. A cheap quick-check is attempted first via _quick_check().
2. If that fails, the engine looks for ITS graphs without any wildcard atoms in the forward and backward sets and post-processes them directly via _early_stop_on_nonwildcard(), without any MCS/fusion. For each such candidate, a canonical reactant/product check is performed to ensure consistency with the original reaction:
   - forward candidates must preserve the original main product component;
   - backward candidates must preserve the original main reactant component.
3. Only if both of these cheap paths fail are fusion and post-processing run in a streaming loop: mappings are fused and post-processed one by one, and the pipeline stops after the first successful fused RSMI.
If early_stop is False, the same loop runs without early exit, collecting all fused ITS and fused RSMIs.
Fast-path-only mode#
If fast_paths_only is True (or process() is called with fast_paths_only=True):
- The engine never enters the expensive MCS/fusion stage (_fuse_and_postprocess() is skipped).
- It only attempts _quick_check() and _early_stop_on_nonwildcard().
- If neither path yields a solution, the engine returns with empty fused_its / fused_rsmis and result['mode'] == "fast_paths_only" and result['reason'] == "fast_paths_no_solution".
The flag early_stop is ignored for the fusion stage in this mode, but still controls behaviour when fast_paths_only=False.
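The interaction of early_stop and fast_paths_only can be summarised as a small control-flow sketch. Everything here is a simplification under stated assumptions: `run_stages` and its callables are hypothetical stand-ins for the private stages, the pairing of mode strings with stages is inferred from the documented result["mode"] examples, and the cheap paths are assumed to run only when either flag is active.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional


def run_stages(
    quick_check: Callable[[], Optional[str]],
    nonwildcard: Callable[[], List[str]],
    fuse_candidates: Callable[[], Iterable[Optional[str]]],
    *,
    early_stop: bool = True,
    fast_paths_only: bool = False,
) -> Dict[str, Any]:
    """Hypothetical sketch of the stage ordering; not the engine's code."""
    if early_stop or fast_paths_only:
        hit = quick_check()                      # cheapest path first
        if hit is not None:
            return {"mode": "quick_check", "fused_rsmis": [hit]}
        hits = nonwildcard()                     # wildcard-free ITS candidates
        if hits:
            return {"mode": "early_stop", "fused_rsmis": hits}
    if fast_paths_only:
        # The fusion stage is never entered in this mode.
        return {"mode": "fast_paths_only",
                "reason": "fast_paths_no_solution",
                "fused_rsmis": []}
    fused: List[str] = []
    for rsmi in fuse_candidates():               # streaming fusion loop
        if rsmi is None:                         # this mapping failed to fuse
            continue
        fused.append(rsmi)
        if early_stop:                           # stop after the first success
            break
    return {"mode": "full_pipeline", "fused_rsmis": fused}


stubs = dict(
    quick_check=lambda: None,
    nonwildcard=lambda: [],
    fuse_candidates=lambda: iter([None, "A>>B", "A>>C"]),
)
print(run_stages(**stubs, early_stop=True)["fused_rsmis"])
print(run_stages(**stubs, early_stop=False)["fused_rsmis"])
print(run_stages(**stubs, fast_paths_only=True)["mode"])
```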
Reactor / hydrogen control#
The underlying SynReactor is configured via three flags that are exposed on the engine:
- implicit_temp – forwarded to SynReactor(..., implicit_temp=...).
- explicit_h – forwarded to SynReactor(..., explicit_h=...).
- embed_threshold – forwarded to SynReactor(..., embed_threshold=...).
This gives fine-grained external control over how templates are embedded and how hydrogens are handled during the reaction stage.
Parameters#
wildcard_element (str, optional) – Element symbol used to denote wildcard atoms (default "*", as in the wildcard framework).
element_key (str, optional) – Node attribute key that stores the element symbol (default "element").
node_attrs (Sequence[str] or None, optional) – Node attributes used by the matcher when comparing nodes. Defaults to ["element", "aromatic", "charge"].
edge_attrs (Sequence[str] or None, optional) – Edge attributes used by the matcher when comparing bonds. Defaults to ["order"].
prune_wc (bool, optional) – If True, ask the matcher to prune wildcard nodes from both graphs before matching (when supported by the matcher class).
prune_automorphisms (bool, optional) – If True, ask the matcher (for example MCSMatcher or ApproxMCSMatcher) to prune automorphism-equivalent mappings, typically collapsing mappings that cover the same host-node set.
mcs_side (str, optional) – Side of the reaction centres to match when using MCSMatcher.find_rc_mapping(). Typical values are "l", "r" or "op".
early_stop (bool, optional) – If True, activate the multi-stage pruning described above and enable streaming early-stop inside the fusion loop.
fast_paths_only (bool, optional) – If True, only fast paths (quick-check and non-wildcard ITS early-stop) are used. The expensive fusion stage is skipped entirely, even if early_stop is True. This can be overridden per-call in process().
max_mappings_per_pair (int, optional) – Hard cap on the number of mappings to consider for each (forward ITS, backward ITS) pair. Default is 1.
implicit_temp (bool, optional) – Flag forwarded to SynReactor (implicit_temp argument). Controls whether the template is treated as implicit.
explicit_h (bool, optional) – Flag forwarded to SynReactor (explicit_h argument). Controls whether explicit hydrogens are kept during reaction application.
embed_threshold (int, optional) – Hard cap forwarded to SynReactor (embed_threshold argument), typically controlling the maximum number of embeddings before the reactor aborts.
reactor_cls (type, optional) – Class used to instantiate the reactor. Must be compatible with SynReactor and expose an its attribute and (optionally) smarts.
wildcard_adder_cls (type, optional) – Class used to decorate reactions with radical wildcards. Defaults to RadicalWildcardAdder.
matcher_cls (type[MCSMatcher] or type[ApproxMCSMatcher], optional) – Class used for ITS matching. By default this is MCSMatcher (exact MCS). It can be replaced by ApproxMCSMatcher for a greedy, approximate search that is much faster but not guaranteed to be globally optimal.
fuse_fn (Callable[[ITSLike, ITSLike, Dict[Any, Any]], ITSLike], optional) – Function used to fuse ITS graphs based on a core mapping. Defaults to fuse_its_graphs().
remove_explicit_H_fn (Callable[[str], str], optional) – Function that removes explicit hydrogens from a reaction SMILES. Defaults to synkit.Chem.utils.remove_explicit_H_from_rsmi().
rsmi_to_its_fn (Callable[…, ITSLike], optional) – Function to convert RSMI to ITS; defaults to synkit.IO.rsmi_to_its().
its_to_rsmi_fn (Callable[[ITSLike], str], optional) – Function to convert ITS to RSMI; defaults to synkit.IO.its_to_rsmi().
h_to_implicit_fn (Callable[[ITSLike], ITSLike], optional) – Function to convert explicit hydrogens to implicit in an ITS or graph; defaults to synkit.Graph.Hyrogen._misc.h_to_implicit().
standardize_h_fn (Callable[[ITSLike], ITSLike], optional) – Function to perform final hydrogen standardization; defaults to synkit.Graph.Hyrogen._misc.standardize_hydrogen().
standardize_fn (Callable[[str], str] or None, optional) – Function used by the quick-check and verification for reaction canonicalization. It should take a reaction string and return a canonicalized reaction string. Typical usage is Standardize().fit. Defaults to a simple identity standardizer that strips whitespace.
logger (logging.Logger or None, optional) – Logger for debug information. If None, a module-level logger is created.
Examples#
Exact MCS back-end#
Use the default MCSMatcher for exact MCS fusion:

```python
from synkit.Synthesis.Reactor.rbl_engine import RBLEngine

rxn = "CCO.CBr>>CCOBr"
template = "CBr>>C[*]"  # toy example

engine = RBLEngine(
    early_stop=True,
    fast_paths_only=False,
    implicit_temp=True,
    explicit_h=False,
    embed_threshold=5000,
)
engine = engine.process(rxn, template)
print(engine.result["mode"])
print(engine.fused_rsmis)
```
Approximate MCS back-end#
Swap in ApproxMCSMatcher to accelerate matching on large graphs while retaining the same RBL API:

```python
from synkit.Graph.Matcher.approx_mcs import ApproxMCSMatcher
from synkit.Synthesis.Reactor.rbl_engine import RBLEngine

rxn = "CC1=CC=CC=C1.OBr>>CC1=CC=CC=C1OBr"
template = "OBr>>O[*]"

engine = RBLEngine(
    matcher_cls=ApproxMCSMatcher,  # use heuristic MCS
    early_stop=False,              # collect all fused hits
    fast_paths_only=False,
)
engine = engine.process(rxn, template)
for fused in engine.fused_rsmis:
    print(fused)
```
- property backward_its: List[Any]#
ITS graphs obtained from the last backward (invert) application.
- Returns:
List of backward ITS graphs.
- Return type:
list[ITSLike]
- property diagnostics: Dict[str, List[Dict[str, Any]]]#
Electron diagnostics grouped by reactor stage.
- property forward_its: List[Any]#
ITS graphs obtained from the last forward application.
- Returns:
List of forward ITS graphs.
- Return type:
list[ITSLike]
- property fused_its: List[Any]#
Fused ITS graphs obtained after wildcard-based core matching.
- Returns:
List of fused ITS graphs.
- Return type:
list[ITSLike]
- help() str[source]#
Return a short textual description of the current engine state.
Useful for quick inspection in interactive sessions.
- Returns:
Multi-line human-readable summary string.
- Return type:
str
- prepare_template(template: str | Graph | Any) RBLEngine[source]#
Prepare a reaction template into a standardized ITS representation.
- process(rsmi: str, template: str | Graph | Any, *, replace_wc: bool = True, fast_paths_only: bool | None = None) RBLEngine[source]#
Run the full RBL pipeline on a reaction RSMI and a template.
1. Split the reaction into reactants/products via '>>'.
2. Optionally attempt a quick-check (_quick_check()) if early-stop or fast-paths-only logic is active. On success, store the solution as the sole entry in fused_rsmis.
3. Prepare the template via prepare_template().
4. Run forward and backward template application via react().
5. Optionally attempt _early_stop_on_nonwildcard() to exploit ITS graphs that contain no wildcard atoms at all, with canonical reactant/product verification.
6. If fast-path-only logic is active and no solution was found in steps 2–5, return without running fusion.
7. Otherwise, run _fuse_and_postprocess() with streaming early-stop behaviour controlled by early_stop.
When fast_paths_only (argument or attribute) is True, only steps 1–6 are executed and the expensive fusion stage is skipped entirely.
- Parameters:
rsmi (str) – Input reaction SMILES.
template (str | nx.Graph | ITSLike) – Template as reaction SMILES, graph or ITS-like.
replace_wc (bool) – If True, replace wildcard atoms by hydrogen during final post-processing.
fast_paths_only (bool or None) – Optional per-call override of the engine-level fast_paths_only flag. If None, the attribute value is used.
- Returns:
The current engine instance.
- Return type:
RBLEngine
- Raises:
ValueError – If the reaction string does not contain '>>' or if template preparation fails.
- react(substrate: str | Any, pattern: Any | None = None, invert: bool = False) RBLEngine[source]#
Public wrapper around _run_reaction() that updates engine state.
If pattern is None, the last prepared template (template_its) is used.
Results are stored in forward_its (for invert=False) or backward_its (for invert=True).
- Parameters:
substrate (str | ITSLike) – Substrate reaction string or ITS-like object.
pattern (ITSLike or None) – Optional template ITS; if None, use template_its.
invert (bool) – If True, store results as backward ITS.
- Returns:
The current engine instance.
- Return type:
RBLEngine
- Raises:
ValueError – If no template pattern was provided or prepared.
- replace_wildcard_with_H(G: Graph) Graph[source]#
Replace wildcard atoms in an ITS graph with hydrogen.
This updates node-level attributes:
- node[element_key]
- typesGH (if present, element field only)
- neighbors lists (string-based)
Edge structure and other attributes are not touched.
- Parameters:
G (nx.Graph) – ITS graph to modify in-place.
- Returns:
The same graph instance, for convenience.
- Return type:
nx.Graph
- property result: Dict[str, Any]#
Summary of the result from the last process() call.
The dictionary contains:
- "fused_rsmis": list of final fused reaction strings.
- "mode": high-level termination mode (e.g. "quick_check", "early_stop", "full_pipeline", "fast_paths_only").
- "reason": short explanation of how/why the pipeline finished.
- "metadata": small auxiliary dictionary with extra details.
- "n_forward_its": number of forward ITS graphs.
- "n_backward_its": number of backward ITS graphs.
- "n_fused_its": number of fused ITS graphs.
- class synkit.Synthesis.Reactor.rule_filter.RuleFilter(host_graph: Graph, rules_list: List[Any], invert: bool = False, engine: str = 'turbo', node_label: str | List[str] = ['element', 'charge'], edge_label: str | List[str] = 'order', distance_threshold: int = 5000, sing_max_path: int = 3)[source]#
Bases: object
Filter a host graph by a list of transformation rules (patterns), keeping only those rules whose (decomposed) pattern appears as a subgraph in the host.
- Parameters:
host_graph (nx.Graph) – The host graph to search within (will be converted to explicit H).
rules_list (list) – A list of rule objects to filter against.
invert (bool) – If True, use the “modifier” component of each decomposition; otherwise use the normal part.
engine (str) – Matching engine to use: “turbo”, “sing”, “nx”, or “mod”.
node_label (str or list) – Node attribute(s) for TurboISO to match on.
edge_label (str or list) – Edge attribute(s) for TurboISO to match on.
distance_threshold (int) – Threshold to skip distance filtering in TurboISO.
sing_max_path (int) – Maximum path length for SING engine.
- Returns:
An instance with only the rules that matched.
- Return type:
RuleFilter
- property host: Graph#
The explicit host graph.
- Returns:
The host graph used for matching.
- Return type:
nx.Graph
- property new_rules: List[Any]#
Subset of rules for which matches[i] is True.
- Returns:
Filtered list of matching rules.
- Return type:
list
- class synkit.Synthesis.Reactor.single_predictor.SinglePredictor[source]#
Bases: object
A class designed for one-step chemical reaction predictions using transformation rules.
This class utilizes transformation rules to predict the outcomes of chemical reactions based on provided SMILES strings.
- class synkit.Synthesis.Reactor.strategy.Strategy(value)[source]#
Strategy for sub-graph matching/application:
ALL: classic VF2 on the whole graph
COMPONENT: component-aware only (no cross-CC backtracking)
BACKTRACK: component-aware with backtracking across CCs
PARTIAL: partial matching (MCS)
- ALL = 'all'#
- BACKTRACK = 'bt'#
- COMPONENT = 'comp'#
- PARTIAL = 'partial'#
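Parameters typed as `Strategy | str` accept either a member or its string value. The stand-in enum below mirrors only the values listed above so that the example runs without SynKit; `StrategySketch` and `coerce_strategy` are illustrative names, and whether the library coerces strings in exactly this way is an assumption.

```python
from enum import Enum
from typing import Union


class StrategySketch(Enum):
    """Stand-in mirroring the documented Strategy values."""
    ALL = "all"
    COMPONENT = "comp"
    BACKTRACK = "bt"
    PARTIAL = "partial"


def coerce_strategy(value: Union[StrategySketch, str]) -> StrategySketch:
    # Enum lookup by value: StrategySketch("bt") -> StrategySketch.BACKTRACK.
    # An unknown string raises ValueError.
    return value if isinstance(value, StrategySketch) else StrategySketch(value)


print(coerce_strategy("bt"))
print(coerce_strategy(StrategySketch.ALL))
```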
- class synkit.Synthesis.Reactor.syn_reactor.SynReactor(substrate: str | Graph | SynGraph, template: str | Graph | SynRule, invert: bool = False, canonicaliser: GraphCanonicaliser | None = None, explicit_h: bool = True, implicit_temp: bool = False, strategy: Strategy | str = Strategy.ALL, partial: bool = False, template_format: Literal['typesGH', 'tuple'] = 'typesGH', electron_diagnostics: bool = False, embed_threshold: int | None = None, embed_pre_filter: bool = False, automorphism: bool = True)[source]#
Bases: object
A hardened and typed re-write of the original SynReactor, preserving API compatibility while offering safer, faster, and cleaner behavior.
- Parameters:
substrate (Union[str, nx.Graph, SynGraph]) – The input reaction substrate, as a SMILES string, a raw NetworkX graph, or a SynGraph.
template (Union[str, nx.Graph, SynRule]) – Reaction template, provided as SMILES/SMARTS, a raw NetworkX graph, or a SynRule.
invert (bool) – Whether to invert the reaction (predict precursors). Defaults to False.
canonicaliser (Optional[GraphCanonicaliser]) – Optional canonicaliser for intermediate graphs. If None, a default GraphCanonicaliser is used.
explicit_h (bool) – If True, render all hydrogens explicitly in the reaction-center SMARTS. Defaults to True.
implicit_temp (bool) – If True, treat the input template as implicit-H (forces explicit_h=False). Defaults to False.
strategy (Strategy or str) – Matching strategy, one of Strategy.ALL, ‘comp’, or ‘bt’. Defaults to Strategy.ALL.
partial (bool) – If True, use a partial matching fallback. Defaults to False.
template_format (ITSFormat) – ITS representation used when template is a reaction string. Defaults to "typesGH" for compatibility.
electron_diagnostics (bool) – If True, expose per-result electron-accounting diagnostics without changing generated products.
- Variables:
_graph (Optional[SynGraph]) – Cached SynGraph for the substrate.
_rule (Optional[SynRule]) – Cached SynRule for the template.
_mappings (Optional[List[MappingDict]]) – Cached list of subgraph-mapping dicts.
_its (Optional[List[nx.Graph]]) – Cached list of ITS graphs.
_smarts (Optional[List[str]]) – Cached list of SMARTS strings.
_flag_pattern_has_explicit_H (bool) – Internal flag indicating explicit-H constraints.
- canonicaliser: GraphCanonicaliser | None = None#
- property diagnostics: List[Dict[str, Any]]#
Return optional electron-accounting diagnostics for built ITS graphs.
- classmethod from_smiles(smiles: str, template: str | Graph | SynRule, *, invert: bool = False, canonicaliser: GraphCanonicaliser | None = None, explicit_h: bool = True, implicit_temp: bool = False, automorphism: bool = False, strategy: Strategy | str = Strategy.ALL, template_format: Literal['typesGH', 'tuple'] = 'typesGH', electron_diagnostics: bool = False) SynReactor[source]#
Alternate constructor: build a SynReactor directly from SMILES.
- Parameters:
smiles (str) – SMILES string for the substrate.
template (str or networkx.Graph or SynRule) – Reaction template (SMILES/SMARTS string, Graph, or SynRule).
invert (bool) – If True, perform backward prediction (target→precursors). Defaults to False (forward prediction).
canonicaliser (GraphCanonicaliser or None) – Optional GraphCanonicaliser to use for internal graphs.
explicit_h (bool) – If True, keep explicit hydrogens in the reaction center.
implicit_temp (bool) – If True, treat the template as implicit-H (forces explicit_h=False).
strategy (Strategy or str) – Matching strategy: ALL, ‘comp’, or ‘bt’. Defaults to ALL.
template_format (ITSFormat) – ITS representation used when template is a reaction string. Defaults to "typesGH".
electron_diagnostics (bool) – If True, expose per-result electron diagnostics without changing products.
- Returns:
A new SynReactor instance.
- Return type:
SynReactor
- property graph: SynGraph#
Lazily wrap the substrate into a SynGraph.
- Returns:
The reaction substrate as a SynGraph.
- Return type:
SynGraph
- property its#
- property its_list: List[Graph]#
Build ITS graphs for each subgraph mapping.
- Returns:
A list of ITS (Imaginary Transition State) graphs.
- Return type:
List[Graph]
- property mapping_count#
Number of mappings.
- property mappings: List[Dict[Any, Any]]#
Return unique sub‑graph mappings, optionally pruned via automorphisms.
- property rule: SynRule#
Lazily wrap the template into a SynRule.
- Returns:
The reaction template as a SynRule.
- Return type:
SynRule
- property smarts#
- property smiles_list#
- property substrate_smiles#
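The lazily evaluated properties above (graph, rule, mappings) pair each public property with a private cache variable such as _graph or _mappings. A minimal sketch of that compute-once pattern follows. LazyReactorSketch is a hypothetical stand-in, not SynKit code: it "matches" a template by plain substring search so that it runs without SynKit installed.

```python
from typing import List, Optional


class LazyReactorSketch:
    """Hypothetical stand-in illustrating lazy, cached properties."""

    def __init__(self, substrate: str, template: str) -> None:
        self.substrate = substrate
        self.template = template
        self._mappings: Optional[List[int]] = None  # cache, filled on first access
        self.compute_calls = 0  # counts how often the expensive step runs

    @property
    def mappings(self) -> List[int]:
        # Compute only when the cache is empty; later reads reuse the result.
        if self._mappings is None:
            self.compute_calls += 1
            positions: List[int] = []
            start = self.substrate.find(self.template)
            while start != -1:
                positions.append(start)
                start = self.substrate.find(self.template, start + 1)
            self._mappings = positions
        return self._mappings

    @property
    def mapping_count(self) -> int:
        # Derived property: relies on the cached mappings.
        return len(self.mappings)


reactor = LazyReactorSketch("CCOCCO", "CO")
print(reactor.mapping_count)   # triggers the single computation
print(reactor.mappings)        # served from the cache
print(reactor.compute_calls)
```

The design choice is that construction stays cheap: nothing is matched until a property is first read, and repeated reads cost nothing extra.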
Multi-step search#
- class synkit.Synthesis.MSR.multi_steps.MultiSteps[source]#
Bases:
object
- multi_step(original_rsmi: str, list_rule: List[str], order: List[int], cat: str | List[str]) List[str][source]#
Orchestrate a multi-step chemical reaction process using a set of rules and a starting reactant.
- Parameters:
original_rsmi (str) – Initial reactant SMILES string.
list_rule (List[str]) – List of GML rules for the reactions.
order (List[int]) – Order of application of the GML rules.
cat (str or List[str]) – Catalysts or additional reagents to be added; can be a single string or a list of strings.
- Returns:
List of reaction SMILES strings with atom-atom mapping applied after all steps.
- Return type:
List[str]
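How list_rule and order interact can be sketched with a toy example. This is not the library's implementation: it assumes that order holds zero-based indices into list_rule, and it replaces GML rules with plain Python callables (the hypothetical apply_in_order helper and the two string-rewriting rules exist only for illustration). No atom mapping or catalyst handling is performed.

```python
from typing import Callable, List


def apply_in_order(
    start: str,
    rules: List[Callable[[str], str]],
    order: List[int],
) -> List[str]:
    """Apply rules[i] for each i in order, recording one reaction string per step."""
    steps: List[str] = []
    current = start
    for idx in order:  # assumption: zero-based indices into the rule list
        product = rules[idx](current)
        steps.append(f"{current}>>{product}")
        current = product  # the product of one step feeds the next
    return steps


# Toy "rules" implemented as string rewrites (illustrative only).
toy_rules = [
    lambda s: s.replace("Cl", "O"),    # rule 0: swap Cl for O
    lambda s: s.replace("CO", "C=O"),  # rule 1: rewrite C-O as C=O
]

print(apply_in_order("CCCl", toy_rules, order=[0, 1]))
```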
- class synkit.Synthesis.MSR.path_finder.PathFinder(reaction_rounds: List[Dict[str, List[str]]])[source]#
Bases:
object
- search_paths(input_smiles: str, target_smiles: str, method: str = 'bfs', max_solutions: int | None = None, cheapest: bool = True) List[List[str]][source]#
Search for reaction pathways from the input molecule to the target molecule using a specified method, optionally limiting the number of solutions.
The cheapest flag controls pruning:
- If cheapest=True (default), BFS uses a visited set and A* prunes costlier routes (the standard behaviour).
- If cheapest=False, BFS does not track visited states and A* does not prune costlier routes, so both return more solutions. This may produce duplicates, or a very large number of solutions, if the reaction network contains cycles.
- Parameters:
input_smiles (str) – SMILES of the starting molecule.
target_smiles (str) – SMILES of the target molecule.
method (str, optional) – Search method: 'bfs', 'astar', or 'mc'. Defaults to 'bfs'.
iterations (int, optional) – Number of MC iterations (if method='mc'). Listed in the docstring but not present in the signature shown above.
max_solutions (int, optional) – If set, stop after finding this many solutions.
cheapest (bool, optional) – Controls pruning. True (default) gives standard BFS/A*; False gives "unrestricted" BFS/A*.
- Returns:
Each solution path is a list of reaction SMILES from start to target.
- Return type:
List[List[str]]
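The effect of the cheapest flag on BFS can be illustrated with a small standalone sketch. It is not PathFinder's code: the bfs_paths function, the flat molecule-to-reactions dictionary, and the max_depth guard are all assumptions made for illustration (the real class takes reaction_rounds, whose exact layout is not described here). The reaction strings are unbalanced toy examples.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


def bfs_paths(
    reactions: Dict[str, List[str]],
    start: str,
    target: str,
    cheapest: bool = True,
    max_solutions: Optional[int] = None,
    max_depth: int = 4,  # hypothetical guard so the unrestricted search terminates
) -> List[List[str]]:
    """Breadth-first search returning paths as lists of reaction SMILES."""
    solutions: List[List[str]] = []
    queue: Deque[Tuple[str, List[str]]] = deque([(start, [])])
    visited = {start}
    while queue:
        molecule, path = queue.popleft()
        if molecule == target:
            solutions.append(path)
            if max_solutions is not None and len(solutions) >= max_solutions:
                break
            continue
        if len(path) >= max_depth:
            continue
        for rsmi in reactions.get(molecule, []):
            product = rsmi.split(">>")[1]
            if cheapest:
                # Standard BFS: each molecule is expanded at most once.
                if product in visited:
                    continue
                visited.add(product)
            queue.append((product, path + [rsmi]))
    return solutions


# Toy network: ethanol reaches acetaldehyde directly or via ethylene.
network = {
    "CCO": ["CCO>>CC=O", "CCO>>C=C"],
    "C=C": ["C=C>>CC=O"],
}

print(bfs_paths(network, "CCO", "CC=O", cheapest=True))
print(bfs_paths(network, "CCO", "CC=O", cheapest=False))
```

With the visited set enabled, only the shortest route survives; without it, the longer route through ethylene is also reported.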
Metrics#
- synkit.Synthesis.Metrics._plot.plot_f2_scores_line(data, figsize=(8, 6), show_f2=True, show_legend=True)[source]#
Plot F2 scores across radii as a line plot to show the trend, optionally annotating each point with its F2 score.
- Parameters:
data (dict) – Dictionary of nested dictionaries, each containing 'F2_score' and possibly other metrics.
figsize (tuple) – Figure size for the plot. Defaults to (8, 6).
show_f2 (bool) – Whether to show F2 scores on the curve. Defaults to True.
show_legend (bool) – Whether to show the legend on the plot. Defaults to True.
Example data format: {'radii_0': {'Novelty': 96.44, 'Coverage': 93.98, 'Recognition': 3.55, 'F2_score': 0.15}, …}
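The documentation does not state how F2_score is derived. One reading that reproduces the example value is the standard F-beta score with beta = 2, taking Recognition as precision and Coverage as recall, both given as percentages. The sketch below works under that assumption; the f_beta helper is hypothetical and not part of SynKit.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta > 1 weights recall more than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Entry taken from the example data format above.
entry = {"Novelty": 96.44, "Coverage": 93.98, "Recognition": 3.55, "F2_score": 0.15}

# Assumption: Recognition plays the role of precision, Coverage of recall,
# and both are percentages that must be scaled to [0, 1].
f2 = f_beta(entry["Recognition"] / 100, entry["Coverage"] / 100)
print(round(f2, 2))  # 0.15, matching the documented F2_score
```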
- synkit.Synthesis.Metrics._plot.plot_recognition_coverage_curve(data, coverage_col='Coverage', recognition_col='Recognition', f2_score_col='F2_score', figsize=(8, 6), show_f2=True, show_legend=True)[source]#
Plot a Recognition-Coverage curve from the provided data, optionally annotating points with F2 scores. Styled with Seaborn.
- Parameters:
data (dict) – Nested dictionary containing the data for each radius, formatted as in the example below.
coverage_col (str) – Key name for the coverage data in the dictionary.
recognition_col (str) – Key name for the recognition data in the dictionary.
f2_score_col (str) – Key name for the F2 score data in the dictionary.
figsize (tuple) – Figure size for the plot. Defaults to (8, 6).
show_f2 (bool) – Whether to show F2 scores on the curve. Defaults to True.
show_legend (bool) – Whether to show the legend on the plot. Defaults to True.
Example data format: {'radii_0': {'Novelty': 96.44, 'Coverage': 93.98, 'Recognition': 3.55, …}}
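Before plotting, it can help to pull the curve points out of the nested dictionary in radius order. The sketch below assumes keys follow the 'radii_<integer>' pattern seen in the example; the curve_points helper is hypothetical, and every entry except radii_0 uses made-up placeholder values.

```python
from typing import Any, Dict, List, Tuple


def curve_points(
    data: Dict[str, Dict[str, Any]],
    coverage_col: str = "Coverage",
    recognition_col: str = "Recognition",
) -> List[Tuple[str, float, float]]:
    """Return (key, coverage, recognition) tuples sorted by numeric radius."""

    def radius(key: str) -> int:
        # Assumption: keys look like 'radii_0', 'radii_1', ...
        return int(key.rsplit("_", 1)[1])

    return [
        (key, data[key][coverage_col], data[key][recognition_col])
        for key in sorted(data, key=radius)
    ]


data = {
    "radii_10": {"Coverage": 50.0, "Recognition": 40.0},  # placeholder values
    "radii_0": {"Novelty": 96.44, "Coverage": 93.98, "Recognition": 3.55},
    "radii_1": {"Coverage": 90.0, "Recognition": 10.0},   # placeholder values
}

for key, coverage, recognition in curve_points(data):
    print(key, coverage, recognition)
```

Sorting on the numeric suffix rather than the raw string keeps radii_10 after radii_1 instead of between radii_1 and radii_2.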