Software library to speed up chemical reaction processing


Big Data has become ubiquitous in recent years, and especially so in disciplines with heterogeneous and complex data patterns. This is particularly true for chemistry. In some ways, chemical compounds may be compared with synonyms in linguistics because one particular compound can be represented in various ways. To further complicate things, some of them don't even have a specific structure and only exist as an amalgamation of forms turning into each other. That's why it's important to know whether we are dealing with different compounds or with different representations of the same one.

Databases also sometimes contain errors that arise from unfamiliarity with software features or from simple inattentiveness. Special software is needed to detect and correct such errors.

In the case of organic chemistry, reactions are notoriously difficult to analyze. That's why reaction data in chemoinformatics is much less developed than information about single molecules.

The Laboratory of Chemoinformatics and Molecular Modeling at Kazan Federal University (KFU) has been working on this problem since 2013. The efforts have so far been funded by the Government of Russia and the Russian Science Foundation. The group includes researchers from the University of Strasbourg, the University of North Carolina, Moscow State University, Palacky University Olomouc, and the Helmholtz Center in Munich.

The Kazan researchers have learned to predict reaction characteristics, find optimal reaction conditions, and detect and correct data errors. As a result, they have built a unique database of reaction characteristics, which currently includes 3.5 million entries. KFU is the only Russian member of the Reaxys R&D Collaboration, a group working on chemical databases.

In this new project, titled CGRtools, KFU researchers solved a number of problems to better handle reaction information. The software library is significantly richer in functionality than existing tools. CGRtools supports molecules and reactions as objects and is the only tool that supports CGRs (Condensed Graphs of Reaction). It treats chemical objects much like standard Python data types such as integers and strings. Every chemical object is hashable because its atom numbering is canonicalized, so two differently numbered representations of the same compound compare as equal. The objects also support transparent class inheritance, which lets users add new methods and attributes without breaking existing ones.
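To make the idea of hashable chemical objects concrete, here is a minimal Python sketch. It does not use the CGRtools API: the classes `ToyMolecule` and `AnnotatedMolecule` are hypothetical, and the brute-force renumbering is an assumption chosen for clarity that only works for a handful of atoms. It shows how a canonical atom numbering lets two differently numbered descriptions of one compound hash and compare as equal, and how a subclass can add a method without disturbing that behavior.

```python
from itertools import permutations


class ToyMolecule:
    """Hypothetical stand-in for a molecule object; not the CGRtools API."""

    def __init__(self, atoms, bonds):
        # atoms: element symbols by position, e.g. ["C", "C", "O"]
        # bonds: (i, j, order) triples that refer to atom positions
        self.atoms = tuple(atoms)
        self.bonds = tuple(bonds)
        self._key = self._canonical_key()

    def _canonical_key(self):
        # Brute force: try every atom renumbering and keep the smallest
        # (atom labels, bond list) description. Tiny molecules only.
        n = len(self.atoms)
        best = None
        for perm in permutations(range(n)):
            # perm[old_index] gives the new index of that atom
            labels = [None] * n
            for old, new in enumerate(perm):
                labels[new] = self.atoms[old]
            edges = sorted(
                (min(perm[i], perm[j]), max(perm[i], perm[j]), order)
                for i, j, order in self.bonds
            )
            key = (tuple(labels), tuple(edges))
            if best is None or key < best:
                best = key
        return best

    def __eq__(self, other):
        return isinstance(other, ToyMolecule) and self._key == other._key

    def __hash__(self):
        return hash(self._key)


class AnnotatedMolecule(ToyMolecule):
    """Subclass that adds a method and inherits equality and hashing."""

    def heavy_atom_count(self):
        return sum(1 for symbol in self.atoms if symbol != "H")


# Ethanol (hydrogens omitted) written with two different atom numberings.
ethanol_a = ToyMolecule(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethanol_b = AnnotatedMolecule(["O", "C", "C"], [(0, 1, 1), (1, 2, 1)])
# Dimethyl ether: same atoms, different connectivity.
ether = ToyMolecule(["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])

# Because the objects are hashable, a set removes duplicate representations.
unique = {ethanol_a, ethanol_b, ether}
```

Because equality and hashing depend only on the canonical key, the two ethanol objects collapse into a single set entry while dimethyl ether stays separate. A production library would replace the factorial-time search with an efficient canonicalization algorithm.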

Importantly, the library is freely available at


Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.
