Large volumes of data arise in a wide range of NMR applications, from process development to metabolomics, and extracting the information you need from these spectra efficiently is important. In this post, I will explain how the BeautifulJASON Python package can be used to rapidly integrate an entire directory of data and extract those integrals into a text file for further downstream processing.
Installing Beautiful JASON
To follow the steps in this post you will need a copy of JASON and the BeautifulJASON Python package. The latter can be installed using pip with the following command:
pip install beautifuljason
pip will take care of installing all the required dependencies.
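If you would like to confirm the installation from Python before continuing, a minimal sketch along these lines may help. It uses only the standard library, and the helper name installed_version is my own for illustration, not part of BeautifulJASON.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("beautifuljason")
    if found is None:
        print("beautifuljason is not installed; run: pip install beautifuljason")
    else:
        print(f"beautifuljason {found} is installed")
```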
Set up processing with Rules
To handle the processing we will use JASON’s rules feature. This works just like the rules in an email client: when incoming data matches a rule, in this case a processing rule, the rule is applied to the data. So the first steps are to create a rule library and a processing rule within it:
1. Create a new rule library, for example one called “Batch process and integrate”
2. Open a dataset and set up the desired processing. Then, create a processing rule from this dataset
In this example, we wish to extract a set of integrals from our data. Again, we can use the power of JASON’s rules, this time creating an analysis rule:
3. Perform manual integration within the desired ranges and create an analysis rule with the “Custom Ranges” option selected
Just to check that we have done everything correctly, we should reopen the dataset and ensure that the desired processing and analysis are performed as we expect. Now, any dataset matching the rules will have the processing and analysis applied.
Automating the analysis with Beautiful JASON
We now want to automate the analysis of our data stored in a given directory. To do this, we utilise a script included in this download: jason_batch_convert_ext.py.
4. Run the command:
jason_batch_convert_ext <input_folder> <output_folder> --formats jjh5 --extensions jdf --rules "Batch process and integrate" --execute
<input_folder> and <output_folder> should be replaced with the location of the data and the location where you wish the output to be placed.
The other options are as follows:
- --formats specifies the format used to write out the data, in this case JASON’s .jjh5 file format
- --extensions specifies the extensions used for the input files. In the case of data stored in folders rather than individual files, you can use the --patterns option to describe the path to the data
- --execute runs the script. If this option is not specified, a dry run occurs and information about what would be done is returned, but no files are changed
Use the --help option for usage information on the script.
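If you prefer to drive this step from Python, for example to loop over several data directories, the command from step 4 can be assembled and run with the standard library. The following is only a sketch: the function build_convert_command is a hypothetical helper of my own, and it uses just the flags shown above. By default it leaves out --execute, so the script performs a dry run.

```python
import shlex
import subprocess
import sys


def build_convert_command(input_folder, output_folder, rules, execute=False):
    """Assemble the jason_batch_convert_ext command line as a list of arguments."""
    command = [
        "jason_batch_convert_ext",
        str(input_folder),
        str(output_folder),
        "--formats", "jjh5",
        "--extensions", "jdf",
        "--rules", rules,
    ]
    if execute:
        # Without this flag the script only reports what it would do.
        command.append("--execute")
    return command


def main(arguments):
    if len(arguments) != 2:
        print("usage: <input_folder> <output_folder>")
        return
    command = build_convert_command(
        arguments[0], arguments[1], "Batch process and integrate"
    )
    print("Running dry run:", shlex.join(command))
    subprocess.run(command, check=True)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Passing execute=True adds the --execute flag once the dry run reports what you expect.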
You should now have an output folder containing a series of JASON .jjh5 files which have all had the desired processing and analysis applied. The final step is to extract the integral information from these files.
5. Finally, run the script:
batch_extract_integrals <output_folder> output.csv
The first argument is the output folder from step 4, and the second argument is the name of the .csv file to create. There is the option to extract additional parameters from the .jjh5 files using the --parameters flag. More information can be found using the --help option.
The end result of this process should be a folder with processed .jjh5 files and a .csv file containing all the integrals from these data.
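For the downstream processing, the .csv file can be read with Python's built-in csv module. The sketch below deliberately makes no assumption about the column names, since the exact layout depends on your integral ranges and any extra parameters. It does assume that the first row is a header, which you should check against your own file. The function name read_csv_rows is my own.

```python
import csv
import os


def read_csv_rows(csv_path):
    """Return (header, rows) from a CSV file, with every cell kept as a string.

    Assumption: the first row of the file is a header row.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return [], []
    return rows[0], rows[1:]


if __name__ == "__main__" and os.path.exists("output.csv"):
    header, rows = read_csv_rows("output.csv")
    print("Columns:", header)
    print("Number of data rows:", len(rows))
```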
This is just a simple example of how BeautifulJASON can be used to help automate the processing and analysis of large volumes of NMR data. The scripts referenced in this post are included in the BeautifulJASON Python package as ready-to-use tools and can be used as the basis of your own analysis! Ensure that the appropriate Python ‘Scripts’ (on Windows) or ‘bin’ (on macOS) directory is in your path.
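A quick way to check whether the two commands used in this post can be found on your path is to ask Python's shutil.which. This is a small sketch using only the standard library, and the helper name locate_tools is my own.

```python
import shutil


def locate_tools(tool_names):
    """Map each command name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in tool_names}


if __name__ == "__main__":
    tools = ["jason_batch_convert_ext", "batch_extract_integrals"]
    for name, path in locate_tools(tools).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either command is reported as not found, the ‘Scripts’ or ‘bin’ directory mentioned above is the first place to look.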
Happy automating!