Synthesis

The synkit.Synthesis package provides a unified interface for reaction prediction and chemical reaction network (CRN) exploration. It applies rule-based graph rewriting to molecular structures, allowing you to enumerate candidate products (forward mode) or candidate precursors (backward mode) from reaction templates.

Reaction Prediction: Reactor

The synkit.Synthesis.Reactor submodule applies a reaction template (SMARTS / rule) to an input substrate (SMILES) and enumerates all valid transformations under a chosen graph-matching strategy.

Two interchangeable backends are available:

  • NetworkX-based reactor SynReactor (lightweight, pure-Python workflow and tight integration with synkit graphs)

  • MØD-based reactor MODReactor [3] (graph-grammar engine backend, suitable for robust rewriting and larger workloads)

Reactor parameters

| Name | Type | Default | Description |
|------|------|---------|-------------|
| invert | bool | False | Direction of application. Use False for forward prediction (substrate → products) and True for backward prediction (target → precursors). |
| explicit_h | bool | False | When True, hydrogens in the reaction center are rendered explicitly in the output SMARTS. This is useful for debugging, auditing rule scope, and disambiguating closely related matches. |
| strategy | str | 'bt' | Graph-matching strategy used to enumerate transformations. 'comp': component-aware matching (fastest; recommended for multi-component SMILES). 'all': exhaustive arbitrary subgraph search (most expensive). 'bt': fallback strategy (tries 'comp' first, then 'all' if no match is found). |
| template_format | str | 'typesGH' | ITS representation used when the template is a reaction string. Use 'tuple' for the Lewis State Graph representation. |
| electron_diagnostics | bool | False | When True, keep Lewis-state accounting diagnostics on generated ITS objects. This is useful when inspecting charge, lone-pair, or radical recomputation. The option name remains electron_diagnostics for API compatibility. |
| automorphism | bool | True | Deduplicate symmetry-equivalent matches before rewriting. |
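The 'bt' fallback listed for the strategy parameter can be pictured as a two-stage search. The sketch below is a simplified illustration and not SynKit's implementation: `match_comp`, `match_all`, and `match_bt` are hypothetical helpers, and a plain substring test stands in for real subgraph matching.

```python
from itertools import permutations, product


def match_comp(host_parts, pattern_parts):
    """Component-aware: each pattern part must land in a distinct host part."""
    hits = []
    for combo in permutations(range(len(host_parts)), len(pattern_parts)):
        # Substring containment is a toy stand-in for subgraph matching.
        if all(p in host_parts[i] for p, i in zip(pattern_parts, combo)):
            hits.append(combo)
    return hits


def match_all(host_parts, pattern_parts):
    """Exhaustive: pattern parts may land anywhere, even in the same host part."""
    hits = []
    for combo in product(range(len(host_parts)), repeat=len(pattern_parts)):
        if all(p in host_parts[i] for p, i in zip(pattern_parts, combo)):
            hits.append(combo)
    return hits


def match_bt(host_parts, pattern_parts):
    """Fallback: try the cheap component-aware search, then the exhaustive one."""
    hits = match_comp(host_parts, pattern_parts)
    if hits:
        return "comp", hits
    return "all", match_all(host_parts, pattern_parts)


# Two host components: the component-aware search already succeeds.
two_components = match_bt(["CC=O", "CC=O"], ["C=O", "CC"])
# One host component: 'comp' finds nothing, so the exhaustive search is used.
one_component = match_bt(["CC=O"], ["C=O", "CC"])
```

In this toy model the exhaustive search only runs when the cheaper component-aware search returns no hits, which is the cost trade-off the table describes.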

Example: Forward Prediction (NetworkX)

Forward prediction with explicit H and backtracking strategy

from synkit.Synthesis.Reactor.syn_reactor import SynReactor

input_fw = 'CC=O.CC=O'
template = '[C:2]=[O:3].[C:4]([H:7])[H:8]>>[C:2]=[C:4].[O:3]([H:7])[H:8]'

reactor = SynReactor(
    substrate=input_fw,
    template=template,
    invert=False,
    explicit_h=True,
    strategy='bt'
)

smarts_list = reactor.smarts_list
print(smarts_list)

Example output

[
  '[CH3:1][CH:2]=[O:3].[CH:4]([CH:5]=[O:6])([H:7])[H:8]>>[CH3:1][CH:2]=[CH:4][CH:5]=[O:6].[O:3]([H:7])[H:8]',
  '[CH3:4][CH:5]=[O:6].[CH:1]([CH:2]=[O:3])([H:7])[H:8]>>[CH:1]([CH:2]=[O:3])=[CH:5][CH3:4].[O:6]([H:7])[H:8]'
]
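Because each result is an atom-mapped reaction string, a quick sanity check is to confirm that the same map numbers appear on both sides of `>>`. The following is a minimal sketch using only the standard library; `atom_maps` and `is_map_balanced` are hypothetical helper names, not part of SynKit, and the regular expression assumes maps are written as `:<number>]`.

```python
import re

# Matches the ':<n>]' suffix of a mapped atom such as '[CH3:1]' or '[H:7]'.
MAP_RE = re.compile(r":(\d+)\]")


def atom_maps(side):
    """Return the sorted atom-map numbers found in one side of a reaction."""
    return sorted(int(m) for m in MAP_RE.findall(side))


def is_map_balanced(rxn):
    """True when reactant and product sides carry the same atom-map numbers."""
    reactants, products = rxn.split(">>")
    return atom_maps(reactants) == atom_maps(products)


# First entry of the example output above.
rxn = (
    "[CH3:1][CH:2]=[O:3].[CH:4]([CH:5]=[O:6])([H:7])[H:8]"
    ">>[CH3:1][CH:2]=[CH:4][CH:5]=[O:6].[O:3]([H:7])[H:8]"
)
balanced = is_map_balanced(rxn)
```

A string-level check like this only compares map numbers; it does not validate chemistry or element identity.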

Example: Backward Prediction (NetworkX)

Backward prediction targeting product to precursors

from synkit.Synthesis.Reactor.syn_reactor import SynReactor

target = 'CC=CC=O.O'
template = '[C:2]=[O:3].[C:4]([H:7])[H:8]>>[C:2]=[C:4].[O:3]([H:7])[H:8]'

reactor_bw = SynReactor(
    substrate=target,
    template=template,
    invert=True,
    explicit_h=False,
    strategy='comp'
)

precursors = reactor_bw.smarts_list
print(precursors)

Example output

[
  '[CH3:1][CH:2]=[O:6].[CH3:3][CH:4]=[O:5]>>[CH3:1][CH:2]=[CH:3][CH:4]=[O:5].[OH2:6]',
  '[CH3:1][CH3:2].[CH:3]([CH:4]=[O:5])=[O:6]>>[CH3:1][CH:2]=[CH:3][CH:4]=[O:5].[OH2:6]'
]
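In backward mode each result is still written as precursors `>>` target, so the proposed precursors are the left-hand side of every string. The sketch below shows one way to pull them out with the standard library; `precursors_of` is a hypothetical helper, and the naive split on `.` assumes the strings contain no grouped components.

```python
import re


def precursors_of(rxn, keep_maps=False):
    """Return the left-hand side of a backward result as a list of molecules."""
    lhs = rxn.split(">>")[0]
    if not keep_maps:
        # Drop ':<n>' atom-map suffixes, e.g. '[CH3:1]' becomes '[CH3]'.
        lhs = re.sub(r":\d+\]", "]", lhs)
    return lhs.split(".")


# The two entries of the example output above.
results = [
    "[CH3:1][CH:2]=[O:6].[CH3:3][CH:4]=[O:5]>>[CH3:1][CH:2]=[CH:3][CH:4]=[O:5].[OH2:6]",
    "[CH3:1][CH3:2].[CH:3]([CH:4]=[O:5])=[O:6]>>[CH3:1][CH:2]=[CH:3][CH:4]=[O:5].[OH2:6]",
]
candidate_sets = [precursors_of(r) for r in results]
```

With atom maps removed, the first candidate set contains the same fragment twice, which makes duplicate precursors easy to spot.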

Example: Implicit-H Template (NetworkX)

If your template is written in implicit-H form (hydrogens folded into the atom labels, such as [CH2:4]), pass implicit_temp=True and keep explicit_h=False.

Backward prediction with an implicit-H template

from synkit.Synthesis.Reactor.syn_reactor import SynReactor

target = 'CC=CC=O.O'
template = '[C:2]=[O:3].[CH2:4]>>[C:2]=[C:4].[OH2:3]'

reactor_imp = SynReactor(
    substrate=target,
    template=template,
    invert=True,
    explicit_h=False,
    strategy='comp',
    implicit_temp=True
)

precursors = reactor_imp.smarts_list
print(precursors)

Example output

[
  '[CH3:1][CH:2]=[O:6].[CH3:3][CH:4]=[O:5]>>[CH3:1][CH:2]=[CH:3][CH:4]=[O:5].[OH2:6]',
  '[CH3:1][CH3:2].[CH:3]([CH:4]=[O:5])=[O:6]>>[CH3:1][CH:2]=[CH:3][CH:4]=[O:5].[OH2:6]'
]

Lewis State Graph Templates

The NetworkX reactor can consume Lewis State Graph (LSG) templates. This is the SynKit-native path for transformations where valence-state information matters: lone pairs, radicals, valence electrons, and sigma/pi bond components are stored in the template and used during matching and rewriting. In the current API, LSG construction is requested with format="tuple" in rsmi_to_its or template_format="tuple" in SynReactor.

There are two common entry points:

Build the LSG template explicitly

from synkit.IO import rsmi_to_its
from synkit.Synthesis.Reactor.syn_reactor import SynReactor

smart = "[NH3:1].[CH3:2][Cl:3]>>[NH3+:1][CH3:2].[Cl-:3]"
substrate = "CCl.N"
template = rsmi_to_its(smart, core=False, format="tuple")

reactor = SynReactor(
    substrate=substrate,
    template=template,
    implicit_temp=True,
    explicit_h=False,
    electron_diagnostics=True,
)

print(reactor.smarts)

Let SynReactor build an LSG template from a reaction string

from synkit.Synthesis.Reactor.syn_reactor import SynReactor

reactor = SynReactor(
    substrate="CCl.N",
    template="[NH3:1].[CH3:2][Cl:3]>>[NH3+:1][CH3:2].[Cl-:3]",
    template_format="tuple",
    implicit_temp=True,
    explicit_h=False,
    electron_diagnostics=True,
)

LSG rewrite policy:

| Concept | Policy |
|---------|--------|
| Bond truth | sigma_order and pi_order are authoritative in new mode. |
| Product reconstruction | kekule_order is computed from sigma_order + pi_order before conversion through RDKit. |
| Charge | Charge is recomputed from valence electrons, lone pairs, hydrogen count, radical count, and Kekule bond-order sum. |
| Aromaticity | Aromatic flags are still useful for matching and display, but aromatic order=1.5 is not used as the LSG-authoritative rewrite value. |
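The bond and charge rows of the policy can be illustrated with ordinary Lewis-structure bookkeeping. This is a sketch under stated assumptions rather than SynKit's code: the `Bond` class and `formal_charge` function are hypothetical, the field names follow the table, hydrogens are assumed to be counted separately from the Kekule bond-order sum, and the charge formula is the textbook formal-charge rule, which may differ in detail from the library's internal accounting.

```python
from dataclasses import dataclass


@dataclass
class Bond:
    """Hypothetical bond record holding the two authoritative components."""
    sigma_order: int
    pi_order: int

    @property
    def kekule_order(self):
        # Kekule order is derived, not stored: sigma part plus pi part.
        return self.sigma_order + self.pi_order


def formal_charge(valence_electrons, lone_pairs, radicals, h_count, bond_order_sum):
    """Textbook formal charge: valence minus non-bonding electrons minus bonds.

    bond_order_sum is the Kekule bond-order sum to non-hydrogen neighbours;
    each hydrogen in h_count contributes one further single bond.
    """
    non_bonding = 2 * lone_pairs + radicals
    return valence_electrons - non_bonding - h_count - bond_order_sum


# Atoms from the template [NH3:1].[CH3:2][Cl:3]>>[NH3+:1][CH3:2].[Cl-:3]
n_before = formal_charge(5, lone_pairs=1, radicals=0, h_count=3, bond_order_sum=0)
n_after = formal_charge(5, lone_pairs=0, radicals=0, h_count=3, bond_order_sum=1)
cl_before = formal_charge(7, lone_pairs=3, radicals=0, h_count=0, bond_order_sum=1)
cl_after = formal_charge(7, lone_pairs=4, radicals=0, h_count=0, bond_order_sum=0)
```

Under these assumptions nitrogen goes from 0 to +1 and chlorine from 0 to -1, matching the charges written in the template's product side.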

Note: LSG rewriting is currently a SynKit SynReactor path. MØD-backed reactors remain on the legacy rule representation.

Example: Forward Prediction (MØD)

Forward prediction using the MØD backend

from synkit.Synthesis.Reactor.mod_reactor import MODReactor

input_fw = 'CC=O.CC=O'
template = '[C:2]=[O:3].[C:4]([H:7])[H:8]>>[C:2]=[C:4].[O:3]([H:7])[H:8]'

reactor_mod = MODReactor(
    substrate=input_fw,
    rule_file=template,
    invert=False,
    strategy='bt'
)

reaction_list = reactor_mod.reaction_smiles
print(reaction_list)

Example output

['CC=O.CC=O>>CC=CC=O.O']

Example: Backward Prediction with AAM (MØD)

When atom mapping must be retained end-to-end, use the AAM-aware variant (e.g., MODAAM) together with a GML rule representation.

Backward prediction with atom-map preservation

from synkit.Synthesis.Reactor.mod_aam import MODAAM
from synkit.IO import smart_to_gml

input_bw = 'CC=CC=O.O'
rule_gml = smart_to_gml(
    '[C:2]=[O:3].[C:4]([H:7])[H:8]>>[C:2]=[C:4].[O:3]([H:7])[H:8]',
    core=True
)

reactor_aam = MODAAM(
    substrate=input_bw,
    rule_file=rule_gml,
    invert=True,
    strategy='bt'
)

smarts_list = reactor_aam.get_smarts()
print(smarts_list)

Example output

[
  '[CH3:1][CH:2]=[O:3].[CH:4]([CH:5]=[O:6])([H:7])[H:8]>>[CH3:1][CH:2]=[CH:4][CH:5]=[O:6].[O:3]([H:7])[H:8]',
  '[CH3:1][CH:2]([H:3])[H:4].[CH:5]([CH:6]=[O:7])=[O:8]>>[CH3:1][CH:2]=[CH:5][CH:6]=[O:7].[H:3][O:8][H:4]'
]

See Also

  • synkit.IO — format conversion utilities (SMILES/SMARTS/GML and related helpers)

  • synkit.Graph — graph data structures, matching, and transformations