The File object is used instead of a plain string to represent files anywhere, whether on the cluster or in your local application. Some subclasses or extensions of File, such as Trajectory or Frame, carry additional meta information. The underlying base object of a File is called a Location.

All of these objects share the location property, a string that represents the location of a file in general.

f = File('system.pdb')

This representation is useless so far, because we have not specified where the file is located. It could be somewhere on the HPC or on the local computer. To specify this we use prefixes of the form

  1. {drive}://{relative_path} or
  2. {drive}:///{absolute_path} (for local files)

You can use the following prefixes

  • file:// points to files on your local machine.
  • worker:// specifies files in the current working directory of the executing node. Usually these are temporary files for a single execution.
  • shared:// specifies the root shared FS directory (e.g. NO_BACKUP/ on Allegro). Use this to import and export files that are already on the cluster.
  • staging:// a special scheduler-specific caching directory. Use this for files that should be reused but not stored long-term. A typical example is a PDB file: it is required by every simulation, but it is only an input file. You want to copy it once to the cluster and use it over and over.
  • sandbox:// this is a special folder where all temporary worker directories are located. It also contains the session folders for RP.
  • project:// this folder contains all the data for your current project and is the place where data should be kept for long-term storage.
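
The prefix scheme above can be illustrated with a small, self-contained sketch. The helper split_location and the set KNOWN_DRIVES are hypothetical names used only for illustration; this is not how adaptivemd itself parses locations, just one way to separate the drive from the path at the "://" marker.

```python
# Hypothetical helper for illustration only, not adaptivemd's implementation.
KNOWN_DRIVES = {'file', 'worker', 'shared', 'staging', 'sandbox', 'project'}


def split_location(location):
    """Split '{drive}://{path}' into a (drive, path) tuple."""
    drive, sep, path = location.partition('://')
    if not sep:
        # No prefix given: the drive is unknown, the whole string is the path.
        return None, location
    if drive not in KNOWN_DRIVES:
        raise ValueError('unknown prefix: %r' % drive)
    return drive, path


# A relative path on the project drive.
print(split_location('project://models/my_model.json'))
# Three slashes: the path part keeps its leading '/', so it is absolute.
print(split_location('file:///tmp/system.pdb'))
```

Note that with three slashes the path keeps its leading slash, which is what distinguishes form 2 (absolute) from form 1 (relative).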

Later you might want to transfer a file from a project folder to the current working directory (wherever that will be). You would specify the locations in this way:

project://models/my_model.json >> worker://input_model.json
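
If the >> notation is read as a Python operator (an assumption here; the text above only uses it as a notation), it could be modelled by overloading __rshift__. The classes SketchLocation and SketchTransfer below are hypothetical stand-ins and do not reproduce adaptivemd's own classes.

```python
# Hypothetical stand-in classes, not part of adaptivemd.
class SketchTransfer(object):
    """Records that `source` should be transferred to `target`."""

    def __init__(self, source, target):
        self.source = source
        self.target = target

    def __repr__(self):
        return 'Transfer(%s >> %s)' % (self.source.location,
                                       self.target.location)


class SketchLocation(object):
    """Holds a location string and supports `a >> b`."""

    def __init__(self, location):
        self.location = location

    def __rshift__(self, target):
        # `source >> target` only builds a description of the transfer;
        # nothing is copied at this point.
        return SketchTransfer(self, target)


action = (SketchLocation('project://models/my_model.json')
          >> SketchLocation('worker://input_model.json'))
print(action)
```

The design point of such an operator is that the expression describes a transfer to be carried out later, once the working directory is actually known.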

We start with a first PDB file that is located on this machine at a relative path

pdb_file = File('file://../files/alanine/alanine.pdb')

A File, like any complex object in adaptivemd, can have a .name attribute that makes it easier to find later. You can either set the .name property after creation, or use the small helper method .named() to get a one-liner. This method sets .name and returns the object itself.

pdb_file.name = 'initial_pdb'

The .load() at the end is important. It causes the File object to load the content of the file, and if you save the File object, the actual file content is stored with it. This way the file can simply be rewritten on the cluster or anywhere else.
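
The chaining pattern described here, where a helper sets a value and returns the object itself, can be sketched without adaptivemd. SketchFile is a hypothetical class written only to show why .named() and .load() can be appended in one line; its internals are assumptions and not the library's code.

```python
import os
import tempfile


# Hypothetical class for illustration, not adaptivemd's File.
class SketchFile(object):
    def __init__(self, path):
        self.path = path
        self.name = None
        self.content = None

    def named(self, name):
        # Set the name and return self so that calls can be chained.
        self.name = name
        return self

    def load(self):
        # Read the file content into memory so it travels with the object.
        with open(self.path, 'rb') as handle:
            self.content = handle.read()
        return self


# Create a small throwaway file to load.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.pdb')
with open(demo_path, 'wb') as handle:
    handle.write(b'REMARK demo content\n')

demo_file = SketchFile(demo_path).named('initial_pdb').load()
print(demo_file.name)
print(demo_file.content)
```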


Now you can access the content

print pdb_file.get_file()[:500]
REMARK   1 CREATED WITH MDTraj 1.8.0, 2016-12-22
CRYST1   26.063   26.063   26.063  90.00  90.00  90.00 P 1           1
MODEL        0
ATOM      1  H1  ACE A   1      -1.900   1.555  26.235  1.00  0.00          H
ATOM      2  CH3 ACE A   1      -1.101   2.011  25.651  1.00  0.00          C
ATOM      3  H2  ACE A   1      -0.850   2.954  26.137  1.00  0.00          H
ATOM      4  H3  ACE A   1      -1.365   2.132  24.600  1.00  0.00          H
ATOM      5  C   ACE A   1       0.182

There are a few other things that you can access from a file. Like any storable object, it records the time at which it was created and carries a UUID.

print 'timestamp', pdb_file.__time__
print 'uuid', hex(pdb_file.__uuid__)
timestamp 1490777436
uuid 0x5eadd73145711e7a9d3000000000042L
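
Assuming the timestamp is given in seconds since the Unix epoch (an assumption; the text does not state the unit), the value printed above can be turned into a readable date with the standard library. This sketch uses Python 3 syntax and does not touch adaptivemd.

```python
from datetime import datetime, timezone

# Value copied from the output above; assumed to be Unix seconds.
timestamp = 1490777436

created = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print(created.isoformat())
```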

Access the drive (prefix)


Get the path on the drive (see we have converted the relative path to an absolute)

print '...' + pdb_file.dirname[35:]

or the basename

print pdb_file.basename
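
For comparison, the same split into directory and base name can be reproduced with os.path from the standard library. This is only an analogy for what dirname and basename mean; how adaptivemd computes them internally is not shown here.

```python
import os

# The relative path from the example, without the file:// prefix.
relative = '../files/alanine/alanine.pdb'

# Resolve against the current working directory to get an absolute path.
absolute = os.path.abspath(relative)

print(os.path.dirname(absolute))   # absolute directory, depends on where you run it
print(os.path.basename(absolute))  # file name only
```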


Location(location)                              A representation of a path in adaptiveMD
File(location)                                  Represents a file object at a specific location
Trajectory(location, frame, length[, engine])   Represents a trajectory File on the cluster
Frame(trajectory, index)                        Represents a frame of a trajectory
JSONFile(location)                              A special file which is assumed to have JSON-readable content
DataDict(data)                                  Delegate to the contained .data object