.. _quickstart:
`VAMDC `_ is a European collaborative effort to centralize access to
atomic and molecular research data. Data producers maintain their
scientific resources as "nodes" in the VAMDC network. Data consumers can then
query the network from an online portal and receive collated results from the nodes that hold the
data they need. This gives consumers a unified way to access data and, because references are
stored with the data, a clear way to credit the original data producer.
This quickstart guide is aimed at data producers who are interested
in making their research available through VAMDC. This page is relevant to you if any
of the following statements apply:
* I have data that could complement or improve existing atomic/molecular data.
* I have data in some custom format and want to make it available as a VAMDC node.
The `full documentation `_
gives more details. If you run into trouble, do not hesitate to email support@vamdc.eu.
I have complementary data
=========================
This section is for you if you don't want to set up and maintain your data in
a regular VAMDC node. Maybe you have calculated better partition functions or measured
wavelengths more accurately -- things that are best used in
conjunction with other databases. Maybe your data set is too small to
warrant a full node. Or you simply don't have the time to maintain
one.
The solution is to let your data be accessible through an existing VAMDC
database - one dealing with the type of data you have.
Go to the `VAMDC portal `_. Here you will
find a list of all database nodes in VAMDC along with
brief descriptions and contact information. Once you find a database
you think would benefit from your data, contact the maintainer via
the email given on that page. As the internal format of each database
varies, you need to agree with the maintainer on how your data
should be supplied.
I want to publish existing data as a VAMDC node
===============================================
This section is for you if you already maintain a body of data. Maybe it's in a
database. Maybe it's stored using some legacy or otherwise
non-standard storage/access system. Such systems are often highly
optimized for the hardware they were created on, but can be hard to
maintain, update and keep secure in the long run, especially
when the original creators have moved on.
VAMDC's open-source *NodeSoftware* package is downloadable via Git using
`these `_
instructions. There are also tarballs to be found `here `_.
The default NodeSoftware is `Python `_ based and uses the `Django `_
framework and a few more dependencies outlined on the
`prerequisites `_ page. All documentation
refers to the Python version of the software.
The NodeSoftware contains all tools for setting up and running a
VAMDC node. It also offers import tools for converting existing data.
It supports several modern relational databases (*MySQL*,
*PostgreSQL*, etc.).
Once you have everything installed, here is how you
get it set up, in brief:
#. In the NodeSoftware directory, go to ``nodes/``. Copy and rename
the ExampleNode directory. This will hold your new node.
#. In your new node directory, edit ``settings.py``. This sets up your
database and other properties. See other nodes for more examples.
#. You now need to specify the database schema to match how you store
your data. You need to describe the tables as "models" using Django's
easy syntax.
* If you already store data in a relational database, you can let Django create the
database models automatically as described on the
`Django homepage `_.
* If your data is stored in some other form, you need to define your database
schema yourself. See examples in the ExampleNode.
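As an illustration of step 2, the database section of a node's ``settings.py`` might look like the sketch below. The ``DATABASES`` setting and the ``django.db.backends.mysql`` engine are standard Django; the database name, user, password and host are placeholders, and the settings file shipped with the ExampleNode may be organized differently.

```python
# Sketch of the database part of a node's settings.py.
# All values below are placeholders, not taken from the NodeSoftware.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django also ships a PostgreSQL backend
        "NAME": "mynode",         # hypothetical database name
        "USER": "vamdc_user",     # hypothetical database user
        "PASSWORD": "change-me",  # placeholder; never commit real passwords
        "HOST": "localhost",
        "PORT": "",               # empty string selects the backend's default port
    }
}
```

If the data already lives in this database, Django's ``inspectdb`` management command (``python manage.py inspectdb``) prints model definitions generated from the existing tables, which can serve as a starting point for step 3.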
The NodeSoftware can help you import legacy data from text files in
almost any format. The included import tool (in the ``imptools/``
directory) converts such raw data into a format that can be imported directly into a
modern database. The process is summarized below (in more detail in
the `imptool documentation `_).
#. Prepare your raw data as text files (they can be gzipped if very
large).
#. Describe the format of your text in a *mapping file*. This tells
the import tool how it should read your input data and how this maps to the
new database structure you are creating. You can find an example
mapping file in the ExampleNode.
#. Run the import tool on your mapping file. This will convert your
raw input data to intermediate text files that mirror exactly
how the data will be stored in your database.
#. Import the converted files into your database using the SQL command
suitable for your database (such as ``LOAD DATA INFILE`` for MySQL).
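To give a feel for what the conversion in step 3 amounts to, here is a minimal stand-alone sketch in plain Python. It does not use the import tool or its mapping-file syntax; the column layout, the field names and the functions ``convert_lines`` and ``write_rows`` are invented for illustration. It reads fixed-width lines and emits delimiter-separated rows of the kind a bulk-load command such as ``LOAD DATA INFILE`` can ingest.

```python
import csv
import io

# Hypothetical fixed-width layout: (field name, start column, end column).
LAYOUT = [
    ("species", 0, 6),
    ("wavelength", 6, 18),
    ("loggf", 18, 26),
]

def convert_lines(lines, layout=LAYOUT, null="\\N"):
    """Yield one list of column values per non-empty, non-comment input line."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        row = []
        for _name, start, end in layout:
            value = line[start:end].strip()
            # Missing values become \N, which MySQL's LOAD DATA reads as NULL.
            row.append(value if value else null)
        yield row

def write_rows(rows, out, delimiter=";"):
    """Write the converted rows as delimiter-separated text."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in rows:
        writer.writerow(row)

# Made-up sample input; the second data line has no loggf value.
raw = [
    "# species wavelength loggf",
    "Fe I     5000.1234   -1.23",
    "Ti II    4501.2700",
]
buffer = io.StringIO()
write_rows(convert_lines(raw), buffer)
converted = buffer.getvalue()
```

The real import tool is driven by the mapping file rather than a hard-coded layout, so treat this only as a picture of the input and output involved.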
To test your new node you can start it with Django's built-in
development server (``manage.py runserver``). This will start your node locally
on port ``8000`` by default. You can then download the Java-based validation
tool from http://www.vamdc.org/software and try sending some queries.
Test that ``/availability`` and ``/capabilities``
work as they should. Remember to set up some sample queries in your
settings file (see examples in the file) - once you register with VAMDC these will be used to
test your node's status.
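A quick smoke test of the two endpoints can also be scripted. The sketch below uses only the Python standard library and assumes the node is running under the development server on the default port; ``BASE_URL`` and the helper names ``endpoint_urls`` and ``check_node`` are illustrative, and if your node serves these endpoints under a path prefix, that prefix needs to be part of ``BASE_URL``.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: the node was started with "manage.py runserver" on the default port.
BASE_URL = "http://127.0.0.1:8000/"
ENDPOINTS = ("availability", "capabilities")

def endpoint_urls(base_url=BASE_URL, endpoints=ENDPOINTS):
    """Return the full URL of each endpoint, relative to the base URL."""
    if not base_url.endswith("/"):
        base_url += "/"
    return [urljoin(base_url, name) for name in endpoints]

def check_node(base_url=BASE_URL, opener=urlopen, timeout=10):
    """Fetch each endpoint and return a {url: HTTP status} mapping.

    With the default opener, urlopen raises an exception for HTTP error
    statuses or when the server is unreachable.
    """
    statuses = {}
    for url in endpoint_urls(base_url):
        with opener(url, timeout=timeout) as response:
            statuses[url] = response.status
    return statuses
```

Calling ``check_node()`` while the development server is running returns the status for each endpoint. This only checks that the endpoints respond; it does not replace the validation tool.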
The Django test server should *never* be used for anything but
debugging. See the `documentation `_
for instructions on setting up a full-fledged webserver and proxy to serve your data.
I want to connect my existing node to VAMDC
============================================
This section is for you if you already have a database and the NodeSoftware set up.
You now need to make sure VAMDC can talk to it correctly.
VAMDC exchanges data with nodes in a unified format. Queries using well-defined
keywords are sent in one direction, and results in a standardized XML format are returned in the other.
These VAMDC keywords may not match your actual database structure or naming
scheme at all. You thus need to prepare a "dictionary" that
properly maps incoming requests to queries on your database. Conversely,
you need a dictionary to convert your data back to VAMDC's unified
format.
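The idea of the two dictionaries can be sketched in plain Python as follows. This is not the NodeSoftware's actual dictionary format: the names ``RESTRICTABLES`` and ``RETURNABLES``, the two helper functions, and all keyword and field names are illustrative and should be checked against the keyword list and the examples shipped with the NodeSoftware.

```python
# Hypothetical mapping from VAMDC query keywords to this node's own field names.
RESTRICTABLES = {
    "AtomSymbol": "species_name",
    "RadTransWavelength": "wavelength_air",
}

# Hypothetical mapping from VAMDC output keywords to fields of a result row.
RETURNABLES = {
    "AtomSymbol": "species_name",
    "RadTransWavelength": "wavelength_air",
    "AtomStateEnergy": "upper_energy",
}

def translate_restriction(keyword, operator, value):
    """Rewrite one (keyword, operator, value) restriction in local field names."""
    try:
        field = RESTRICTABLES[keyword]
    except KeyError:
        raise ValueError("unsupported keyword: %s" % keyword)
    return (field, operator, value)

def to_vamdc(row):
    """Rename the fields of one database row to their VAMDC keywords."""
    return {key: row[field] for key, field in RETURNABLES.items() if field in row}
```

For instance, ``translate_restriction("AtomSymbol", "=", "Fe")`` yields a restriction on the local ``species_name`` field, and ``to_vamdc`` performs the reverse renaming for a result row.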
The VAMDC dictionaries are necessarily rather complex in order to
cover all possible data forms. They are explained in more detail on
the `concepts `_ page
and in the `new node `_
documentation. There is also a `list of all VAMDC keywords `_.
The NodeSoftware contains many examples of creating your dictionaries.
The final step consists of registering your node with
the central VAMDC repository.
#. Go to the development VAMDC repository at
http://casx019-zone1.ast.cam.ac.uk:/registry/.
#. Choose to create a new entry in the sidebar. In the login
dialog, enter user ``vamdc`` and password ``deploy-ws``.
#. Name your new entry (read the Help link first) and pick the registry type
as "catalogue service".
#. You will next be asked to fill in human-readable information about
your node. The most important parts are highlighted and there are
also help links to read. You should fill in as much information as you can,
but at least these:
* *Title* is used to identify your node
* *Contact details* should contain the email to the node maintainer
* *Description* is used in node listings and describes what type of
data users should expect to find in your node.
#. Find your registration in the registry interface (it will be visible while
filling in the node info earlier) and choose the *Edit* link.
#. From the *Edit* link, choose *Edit metadata ... via VOSI* and enter
the ``/capabilities`` URL of your node. Remember that you must have
set up some sample queries in your settings file as well.
#. Uploading/Saving completes the registration.
Once the node has been validated in the development registry, it will
be transcribed manually to the `main registry `_,
where you can manage it from then on. Data consumers will then be able to access
it from the `main VAMDC portal `_.
Welcome to the VAMDC community!