VAMDC is a European collaborative effort to centralize access to atomic and molecular research data. Data producers maintain their scientific resources as “nodes” in the VAMDC network. Data consumers can then conveniently query the network from an online portal, receiving collated information from the nodes that hold what they need. This gives consumers a unified way to access data and, because references are stored with the data, it also gives scientists a clear way to credit the original data producer.
This quickstart guide is aimed at data producers who are interested in making their research available through VAMDC. This page is relevant to you if any of the following statements apply:
This is for you who don’t want to set up and maintain your data in a regular VAMDC node. Maybe you have calculated better partition functions or measured wavelengths more accurately – things that are best used in conjunction with other databases. Maybe your data set is too small to warrant a full node. Or you simply don’t have the time to maintain one.
The solution is to let your data be accessible through an existing VAMDC database - one dealing with the type of data you have.
Go to the VAMDC portal. Here you will find a list of all database nodes in VAMDC along with brief descriptions and contact information. Once you find a database you think would benefit from your data, contact the maintainer via the email given on that page. As the internal format of each database varies, you need to agree with the maintainer just how your data should be supplied.
This is for you who already maintain a body of data. Maybe it’s in a database. Maybe it’s stored using some legacy or otherwise non-standard storage/access system. Such systems are often highly optimized for the hardware they were created on, but can be hard to maintain, update and keep secure in the long run, especially when the original creators have moved on.
VAMDC’s open-source NodeSoftware package is downloadable via Git using these instructions. There are also tarballs to be found here. The default NodeSoftware is Python-based and uses the Django framework plus a few more dependencies outlined on the prerequisites page. All documentation refers to the Python version of the software.
The NodeSoftware contains all tools for setting up and running a VAMDC node. It also offers import tools for converting existing data. It supports several modern relational databases (MySQL, PostgreSQL etc).
Once you have everything installed, here is how you get it set up, in brief:
- If you already store data in a relational database, you can let Django create the database models automatically (for example with manage.py inspectdb), as described on the Django homepage.
- If your data is stored in some other form, you need to define your database schema yourself. See examples in the ExampleNode.
The NodeSoftware can help you import legacy data from text files in almost any format. The included import tool (in the imptools/ directory) converts such raw data into a format that can be imported directly into a modern database. The process is summarized below (in more detail in the imptool documentation).
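To give a feel for the kind of conversion involved, here is a minimal sketch that turns fixed-width text lines into delimited rows ready for a database bulk import. It does not use the imptools API; the column layout, the names `LAYOUT`, `parse_line` and `convert`, and the sample data are all hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical fixed-width layout: (column name, start offset, end offset).
LAYOUT = [
    ("species", 0, 6),
    ("wavelength", 6, 18),
    ("loggf", 18, 26),
]


def parse_line(line, layout=LAYOUT):
    """Cut one fixed-width line into a dict of stripped string fields."""
    return {name: line[start:end].strip() for name, start, end in layout}


def convert(raw_lines, out, layout=LAYOUT):
    """Write data lines as ';'-delimited rows and return the number written."""
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    count = 0
    for line in raw_lines:
        # Skip blank lines and comment lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        row = parse_line(line, layout)
        writer.writerow([row[name] for name, _, _ in layout])
        count += 1
    return count


# Made-up sample input, used here only to exercise the functions.
raw = [
    "# species wavelength loggf\n",
    "Fe I    5000.123   -1.25\n",
    "Fe II   4923.927   -1.32\n",
]
buffer = io.StringIO()
n_rows = convert(raw, buffer)
print(n_rows)
print(buffer.getvalue(), end="")
```

In practice the real import tool is driven by its own mapping description, so treat this only as a picture of the input and output of the step.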
To test your new node you can start it with Django’s built-in test server (manage.py runserver). This will start your node locally on port 8000 by default. You can then download the Java-based validation tool from http://www.vamdc.org/software and try sending some queries.
Test that <URL>/availability and <URL>/capabilities work as they should. Remember to set up some sample queries in your settings file (see examples in the file) - once you register with VAMDC these will be used to test your node’s status.
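A quick way to check the two endpoints from a script is sketched below, using only the Python standard library. The helper names `endpoint_urls` and `check_node` are made up for this example, and the base URL (including the `/tap` prefix) is an assumption; substitute whatever `<URL>` your node actually answers on.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def endpoint_urls(base_url):
    """Return the availability and capabilities URLs for a node base URL."""
    base = base_url.rstrip("/") + "/"
    return {name: urljoin(base, name) for name in ("availability", "capabilities")}


def check_node(base_url, timeout=10):
    """Fetch each endpoint; report its HTTP status code or the error text."""
    results = {}
    for name, url in endpoint_urls(base_url).items():
        try:
            with urlopen(url, timeout=timeout) as response:
                results[name] = response.status
        except OSError as exc:  # URLError and HTTPError derive from OSError
            results[name] = str(exc)
    return results


# Assumed base URL for a locally running test server; adjust as needed.
urls = endpoint_urls("http://localhost:8000/tap")
for name, url in urls.items():
    print(name, url)
```

With the test server running, calling `check_node("http://localhost:8000/tap")` should report status 200 for both endpoints if the node is answering; an error string instead points to a wrong base URL or a node that is not up.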
The Django test server should never be used for anything but debugging. See the documentation for instructions on setting up a full-fledged webserver and proxy to serve your data.
This is for you who have an existing database and NodeSoftware set up already. You now need to make sure VAMDC can talk to it correctly.
VAMDC exchanges data with nodes in a unified format. Queries arrive using well-defined keywords, and results are returned in a standardized XML format. These VAMDC keywords may not match your actual database structure or naming scheme at all. You thus need to prepare a “dictionary” that properly maps incoming requests to queries against your database. Conversely, you need a dictionary to convert your data back to VAMDC’s unified format.
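The two-way mapping can be pictured with a pair of plain Python dictionaries, as in the sketch below. This is an illustration under assumptions, not the NodeSoftware’s own code: the local column names, the helper functions and the exact layout of the dictionaries are hypothetical, and the keywords shown should be checked against the official VAMDC keyword list.

```python
# Hypothetical mapping from VAMDC query keywords to local column names.
RESTRICTABLES = {
    "AtomSymbol": "species_symbol",
    "RadTransWavelength": "wavelength_air",
}

# Hypothetical mapping from VAMDC output keywords to local column names.
RETURNABLES = {
    "AtomSymbol": "species_symbol",
    "RadTransWavelength": "wavelength_air",
    "RadTransProbabilityLog10WeightedOscillatorStrength": "loggf",
}


def translate_restriction(keyword, operator, value):
    """Map one incoming (keyword, operator, value) triple onto a local column."""
    if keyword not in RESTRICTABLES:
        raise KeyError(f"This node cannot be queried on {keyword!r}")
    return (RESTRICTABLES[keyword], operator, value)


def to_vamdc(row):
    """Relabel a local result row with VAMDC keywords, skipping absent columns."""
    return {key: row[col] for key, col in RETURNABLES.items() if col in row}


# Incoming direction: a VAMDC restriction becomes a condition on a local column.
condition = translate_restriction("RadTransWavelength", "<", 5000.0)
# Outgoing direction: a local row is relabelled with VAMDC keywords.
record = to_vamdc({"species_symbol": "Fe", "wavelength_air": 4923.927})
print(condition)
print(record)
```

Keeping the incoming and outgoing mappings separate reflects that a node may return more quantities than it allows users to restrict on.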
The VAMDC dictionaries are necessarily rather complex in order to cover all possible forms of data. They are explained more extensively on the concepts page and in the new node documentation. There is also a list of all VAMDC keywords. The NodeSoftware contains many examples of how to create your dictionaries.
The final step consists of registering your node with the central VAMDC repository.
- Title is used to identify your node
- Contact details should contain the email to the node maintainer
- Description is used in node listings and describes what type of data users should expect to find in your node.
Once the node has been validated in the development registry, it will be manually transcribed to the main registry <http://registry.vamdc.eu/>, where you can manage it from then on. Normal data consumers will henceforth be able to access it from the main VAMDC portal.
Welcome to the VAMDC community!