12 June 2013

447. Multiuser ECCE

A recent comment led me to make this post:
[..] Can you advise a correct way of setting up ECCE so that everyone can have an individual account and run on the same machine?

It's a valid question. I haven't had any need for it since I'm running my own cluster, which I currently don't share with anyone. I also manage a multi-user cluster overseas, but I am the only one using ECCE, and so I'm fine running the ECCE server here.

Anyway. I built ECCE 6.4 as shown e.g. here and here.

Below I install the ECCE server as the user verahill, and then install the client software as the users lindqvst and me. In the former case I configure the server using a copied config_from_server script, and in the latter case I use config_with_server. Both methods worked fine. All users are on the same machine, but by editing apps/siteconfig/DataServers you can quite easily configure the ECCE client to connect to a remote server.


Server:

hostname
helium
sudo adduser verahill
sudo adduser lindqvst
./install_ecce.v6.4.csh
Main ECCE installation menu
===========================
1) Help on main menu options
2) Prerequisite software check
3) Full install
4) Full upgrade
5) Application software install
6) Application software upgrade
7) Server install
8) Server upgrade

IMPORTANT: If you are uncertain about any aspect of installing
or running ECCE at your site, please refer to the detailed
ECCE Installation and Administration Guide at
http://ecce.pnl.gov/docs/installation/2864B-Installation.pdf

Hit <return> at prompts to accept the default value in brackets.

Selection: [1] 7
Host name: [helium]
Server installation directory: [/home/verahill/ecce-v6.4/server]
Enter the path to the ECCE application software directory that will use this server (even if this directory is not accessible from the current machine) or [return] if you have not installed application software yet or don't want to update it:

ECCE v6.4 will be installed using the settings:
Installation type: [server install]
Host name: [helium]
Server installation directory: [/home/verahill/ecce-v6.4/server]
Are these choices correct (yes/no/quit)? [yes]

ECCE installation succeeded.
***************************************************************
!! You MUST perform the following steps in order to use ECCE !!
-- Transfer the script:
/home/verahill/ecce-v6.4/server/ecce-admin/config_from_server
to the machine where the application software is installed. Run this script as the same user as the application software installation. (This step can be skipped if you have not installed the application software and the server name is specified during that install)
-- Start the ECCE server as 'verahill' by running:
/home/verahill/ecce-v6.4/server/ecce-admin/start_ecce_server
***************************************************************
cp /home/verahill/ecce-v6.4/server/ecce-admin/config_from_server ~
chmod ugo+r config_from_server


First client
su lindqvst
cd ~
cp /home/verahill/config_from_server .
cp /home/verahill/install_ecce.v6.4.csh .
./install_ecce.v6.4.csh
Main ECCE installation menu
===========================
1) Help on main menu options
2) Prerequisite software check
3) Full install
4) Full upgrade
5) Application software install
6) Application software upgrade
7) Server install
8) Server upgrade

IMPORTANT: If you are uncertain about any aspect of installing
or running ECCE at your site, please refer to the detailed
ECCE Installation and Administration Guide at
http://ecce.pnl.gov/docs/installation/2864B-Installation.pdf

Hit <return> at prompts to accept the default value in brackets.

Selection: [1] 5
Host name: [helium]
Server installation directory: [/home/lindqvst/ecce-v6.4/apps]

ECCE installation succeeded.
***************************************************************
!! You MUST perform the following steps in order to use ECCE !!
-- Configure the application software to use the desired ECCE server by running the script:
/home/lindqvst/ecce-v6.4/apps/scripts/config_with_server
as the same user as the application software installation. (This step can be skipped if you prefer to copy over and run the config_from_server script created during the server installation in the ecce-admin directory)
-- To register machines to run computational codes, please see the installation and compute resource registration manuals at http://ecce.pnl.gov/using/installguide.shtml
-- Before running ECCE each user must source an environment setup script.
For csh/tcsh users add this to ~/.cshrc:
if ( -e /home/lindqvst/ecce-v6.4/apps/scripts/runtime_setup ) then
source /home/lindqvst/ecce-v6.4/apps/scripts/runtime_setup
endif
For sh/bash users, add this to ~/.profile or ~/.bashrc:
if [ -e /home/lindqvst/ecce-v6.4/apps/scripts/runtime_setup.sh ]; then
. /home/lindqvst/ecce-v6.4/apps/scripts/runtime_setup.sh
fi
***************************************************************
Next, configure:
./config_from_server
Enter the ECCE application software installation home directory: /home/lindqvst/ecce-v6.4/apps
ECCE application software home directory is /home/lindqvst/ecce-v6.4/apps
Is this correct? [yes]
Adding data server URL for helium to siteconfig/DataServers
Adding URL for online help to siteconfig/site_runtime
Adding message server URL for helium to siteconfig/jndi.properti
And then do
echo 'export ECCE_HOME=/home/lindqvst/ecce-v6.4/apps' >> ~/.bashrc
echo 'export PATH=${ECCE_HOME}/scripts:${ECCE_HOME}/scripts/parsers:${PATH}' >> ~/.bashrc


Second client:
I set up a third user and installed ECCE as shown above for lindqvst, but instead of running config_from_server I did:
/home/me/tmp/ecce-v6.4/ecce-v6.4/apps/scripts/config_with_server
Enter the full host name of the machine where the ECCE server is installed: helium
ECCE server host name is helium
Is this correct? [yes]
And then do
echo 'export ECCE_HOME=/home/me/ecce-v6.4/apps' >> ~/.bashrc
echo 'export PATH=${ECCE_HOME}/scripts:${ECCE_HOME}/scripts/parsers:${PATH}' >> ~/.bashrc
source ~/.bashrc
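
Both clients end with the same two echo lines, and running them a second time leaves duplicate entries in ~/.bashrc. Below is a minimal sketch of a guard against that. The helper names add_line_once and add_ecce_env are my own and not part of ECCE, and the apps directory passed in is assumed to match wherever the client was installed.

```shell
#!/bin/sh
# add_line_once and add_ecce_env are hypothetical helpers, not part of ECCE.

# Append LINE to FILE unless an identical line is already there.
add_line_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Add the two ECCE environment lines used in this post to an rc file.
add_ecce_env() {
    rc_file=$1
    apps_dir=$2
    add_line_once "$rc_file" "export ECCE_HOME=$apps_dir"
    # Single quotes keep ${ECCE_HOME} and ${PATH} unexpanded in the rc file.
    add_line_once "$rc_file" 'export PATH=${ECCE_HOME}/scripts:${ECCE_HOME}/scripts/parsers:${PATH}'
}
```

For the second client this would be called as add_ecce_env ~/.bashrc /home/me/ecce-v6.4/apps, followed by source ~/.bashrc as before.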


Tying it all together:

Start the server as user verahill:
su verahill
/home/verahill/ecce-v6.4/server/ecce-admin/start_ecce_server
/home/verahill/ecce-v6.4/server/httpd/bin/apachectl start: httpd started
[1] 13951
INFO BrokerService - ActiveMQ 5.1.0 JMS Message Broker (localhost) is starting
INFO BrokerService - ActiveMQ JMS Message Broker (localhost, ID:helium-39358-1370944321599-0:0) started
ps a|grep 13951
13951 pts/3 S 0:00 grep -v -e ACTIVEMQ -e Loading -e AMQ -e Kaha -e help -e Transport
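You should be able to autostart ECCE using something along the lines of the following in your /etc/rc.local (the -c makes su run the path as a command instead of passing it to the shell as a script argument):
su verahill -c '/home/verahill/ecce-v6.4/server/ecce-admin/start_ecce_server' &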
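
If the server directory lives on a disk that may not be mounted yet at boot, it can help to check for the start script before calling it. The following is only a sketch, which I have not tested in an actual rc.local. The function name start_ecce_as is made up for illustration, and it uses su's -c option to hand over the command.

```shell
#!/bin/sh
# start_ecce_as is a hypothetical helper for /etc/rc.local, not part of ECCE.
# Usage: start_ecce_as USER /path/to/start_ecce_server
start_ecce_as() {
    user=$1
    script=$2
    if [ ! -x "$script" ]; then
        echo "start script not found or not executable: $script" >&2
        return 1
    fi
    # su -c runs the command line in the target user's shell; & backgrounds it
    # so that the rest of the boot sequence is not held up.
    su "$user" -c "$script" &
}
```

In rc.local the call would then be start_ecce_as verahill /home/verahill/ecce-v6.4/server/ecce-admin/start_ecce_server.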
As a user, start ecce:
su lindqvst
/home/lindqvst/ecce-v6.4/apps/scripts/ecce
Create a password

Create a job
Then log out and log in as another user, e.g. me

su me
ecce
me is in reality called something else
Note that as user me you can't see any of the files in lindqvst's folder -- nor access them without inputting the correct (ECCE) username and password.

And again, note that there are alternative ways of setting this up -- you could have everyone log in as the same linux user, but still retain different ECCE identities. I think what I show here is more in line with how most people would want to use ECCE though.

Also, note that you can quite easily be a client on a different computer too -- the key lies in editing the apps/siteconfig/DataServers file.
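
As a rough sketch of that edit: the config_from_server output above says that a data server URL for helium is written to siteconfig/DataServers, so I am assuming here that the host name appears in that file right after the "//" of a URL. I have not checked the exact file format, so look at the file first. The helper name retarget_ecce_client and the host names in the usage line are made up.

```shell
#!/bin/sh
# retarget_ecce_client is a hypothetical helper, not part of ECCE.
# Usage: retarget_ecce_client APPS_DIR OLD_HOST NEW_HOST
# Assumption: the server host appears in DataServers as part of a URL,
# i.e. directly after "//". Inspect the file before relying on this.
retarget_ecce_client() {
    apps_dir=$1
    old_host=$2
    new_host=$3
    conf="$apps_dir/siteconfig/DataServers"
    if [ ! -f "$conf" ]; then
        echo "not found: $conf" >&2
        return 1
    fi
    # Keep a backup next to the original, then rewrite the host name.
    cp "$conf" "$conf.bak"
    sed "s|//$old_host|//$new_host|g" "$conf.bak" > "$conf"
}
```

For example, retarget_ecce_client /home/me/ecce-v6.4/apps helium server.example.org would point the client of user me at a server on server.example.org, leaving the old file as DataServers.bak.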

18 comments:

  1. Thank you for your answer. I searched for a while but couldn't work out how to configure the server.
    Our office currently has one spare computer fitted with a slightly higher-quality graphics card, so I was thinking of using it as a shared workstation running ECCE on Linux. From the instructions you give in this post I have the impression that the ECCE server is a computer with a direct or high-bandwidth connection to a cluster or supercomputer. If that is the case, we need a higher-end workstation for this task. In the meantime I will set up the workstation as you suggested: log in as one Linux user with different ECCE identities. It would be helpful if you could recommend a standard technical spec for an ECCE server. When I use the viewer to check a system after an optimisation task, I notice that as soon as I hit 'quit' the viewer tries to download the data from the cluster to the ECCE server. If the data is big, it takes a very long time to close the viewer.

    The problem with a virtual machine is that the image file tends to get very big over time; therefore, I keep the first created image for transferring to other people.

    1. Alvyn,
      So you're running all of this in a virtual machine? And each user logs in to the same copy of the same virtual machine?

      1. ECCE is just a server -- it doesn't need anything high-end (although see this for info about memory: http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id836/Help_with_JVM_heap_error.html). For example, I set up the example above on an old revived Dell Dimension C521 from 2007 which is worse than anything you'd find on the market today, short of a tablet.

      For my personal use I run it on my desktop, which is an old AMD Athlon II X3, with an Nvidia GT430 card, and 8 GB RAM (not because of ECCE, but had to make a video of an MD simulation, so upgraded from 4 GB).

      2. You'd (as you realise) be better off running it on bare metal instead of in a virtual machine. And if you do run normal-sized jobs you'll find that the virtual machine will quickly fill up. I don't know in what environment you work, but you may well find that there's a five-year-old computer hiding in a corner which you'd be able to revive and dedicate to this stuff.

      3. If you do stick with a virtual machine I do think you're better doing what you're suggesting: one linux identity, and several ECCE ones.

      4. I think the original post that you used described ECCE 6.3, which is generally very slow. In particular, it opens the viewer very, very slowly. ECCE 6.4 is much snappier, so consider using it instead, although be aware that if you want to run it on a 32 bit OS you'll need to compile it yourself since only the 64 bit binaries are released.

      Let me know how it goes.


    2. The story goes like this:

      Initially I used a VM because I only have one Windows machine and a Mac. We do not have many spare machines to run Linux. Using a VM allows me to test ECCE and nwchem both at work and at home. In addition, I am able to distribute the VM to colleagues. However, I noticed that the virtual machine fills up very quickly, as you said, so I was looking to use a Linux machine primarily for ECCE tasks. Then I fitted an NVidia Quadro 800 graphics card to one of our computers. I was unsure how a CUDA-accelerated graphics card would assist our work, so I only equipped one Linux machine for testing initially. I was also wondering whether users on the Linux machine can have individual accounts, so that every Linux user can run ECCE while logged into their own Linux account.
      From the options you described, I think I was doing it incorrectly by trying to let individual Linux accounts run the ECCE client. Now I will try to create ECCE accounts for individual users instead. I might also try to configure the Linux machine as an ECCE server, so that my colleagues can run the ECCE client from their VMs.

      Another question from me: if I run only an ECCE client on the VM, would my VM fill up quickly too?

    3. I have to reiterate that I'm not an expert on ECCE or nwchem, and I am not a developer of either, so take what I say with a grain of salt.

      The easier question first: if you only run the client on VM then it should not fill up, as all the data is managed and stored by the server (in the server/data/Ecce/users/ directory)

      Now to speculation:
      I think the original intention is that each ECCE user connects to a central server from individual workstations on which they do not necessarily have a linux user account.

      Personally, if I was setting up a single workstation with multiple ecce users as you describe above, I'd also give them individual linux user accounts and copies of ECCE. The reason has nothing to do with ECCE -- if there's an issue with one linux account it won't affect the other users. It also allows users to explore options without fear of causing issues for other users, which in turn is good for learning.

      So I think you're doing it right, with each user having both a linux account and an ecce account.

      At a minimum I'd separate the accounts that run the server and the client. The potential for disaster is smaller.

      There's one thing I haven't touched on though:
      I was playing around with connecting to the ecce server on my workstation from home and found that I couldn't do e.g. 'Reconnect Job Monitoring' in the client at work if I had launched the job via my client at home. Somehow it stored the hostname of the submitting computer. Not sure whether this is a bug or a feature.

      I'll see if I can look into that in the future. Currently I'm trying to write a script for adding nwchem basis sets to ECCE -- the basis set list hasn't been updated for quite some time...(e.g. no def2- sets)

    4. Your posts have been really helpful for me. I wish I could have more discussions with people who know about ECCE and nwchem.

      Checking on the system while at home is not a big problem for me, as I use remote desktop to check on the computer in the office while I am at home.

      If I have more time I will try to dig into the code, but I have not yet looked into how to set up a debug environment using an IDE.

    5. No worries. I'm hardly an expert on ECCE, but I've always felt that one of several reasons why ECCE has remained obscure is that the documentation is so poor/out of date. The more people share their experiences online, the more people may adopt ECCE, which increases the potential developer base etc.

      As for the code, be aware that it's written in at least four different languages -- the interface is written in python, the parsing scripts are in perl, while other parts are in C++ and Java (afaik).

    6. I found that when I open ECCE viewer to check on a dynamics simulation, no matter whether I read the trajectory or not, the viewer downloads lots of information just as if it is downloading the trajectory data back to the server. I guess that is the reason the virtual machine gets stuffed up quickly.

    7. I'm guessing, but I think the data should go to /tmp/ecce-$USER (e.g. /tmp/ecce-alvyn) -- at least that's where data goes during run monitoring.

      I don't think you can get around the viewer downloading all the data on opening. Is all the trajectory data contained in the ecce output file?

      Assuming that stuff gets cached in /tmp, one solution is to find workable policies for cleaning /tmp (reboot vs every 24 hrs etc). Another -- better? -- would be to link your /tmp in the virtual machine to a folder on your physical machine.

      Guessing, as always. But those are the first few things I'd try. Checking whether it really goes to your /tmp would be the first step.

      On a different note:
      I haven't had much luck using ECCE to set up workable MD stuff, and judging from what I've seen online (and what I haven't) people have been struggling with that in general. If you have the expertise, it may be a good topic for a blog/blogpost from you (on your own blog, or here, whichever would make your life easier)? I'm sure there's an audience given that the main alternatives (NAMD/VMD and Gromacs) are also quite challenging to get started with.





    8. I use gnome-system-monitor for the traffic monitoring. As I check via 'df -h' the /dev/sda1 Use% does get bigger and bigger after I tried to close the viewer of an MD simulation. I guess ECCE server tries to cache the data to /tmp/ecce_user/trj so that the next time it opens up the downloading process (will?) be faster.

      Restarting the machine does get the data in the tmp folder cleaned up.

      I am running a few MD simulations. Once I am done with them, I will post a summary of the procedures in my blog.

    9. Hi Lindqvist

      I posted a feedback here: http://saccharides.blogspot.tw/2013/06/ecce-md-calculation.html

      Hope it helps.

    10. Cheers. The post is very well-written. I've thrown in a couple of links to it from related pages here.

      I've always had the impression that ECCE SHOULD be able to function as a full-featured MD GUI, but that the implementation was never fully finished. The documentation has needed a bit of an overhaul as well. I don't think there ever was much out there other than this video: http://ecce.emsl.pnl.gov/support/video_help/md_demo/md_demo.html

  2. I have this video. (I will append it to my post.) It reminds me that I don't really follow the centre->expand->solvate->envelope instructions. Part of the reason is that some of my systems are huge, and I don't really want to expand their size too much. I noticed that when I let the system equilibrate, the box size shrinks. After adding a sufficient number of counter ions, the system actually grows substantially. I am not quite sure if this is necessary. Or perhaps one can make a smaller system by constraining the size of the box. I would really like to know how people actually do it.

    I can't do [Get Charge] whenever there is any amino acid, DNA, RNA, or anything else out of the ordinary inside the system (like HIS -> HIE). That's why I always check the output file to get the charge after a pre-run.

    One thing I really want to ask about is the render style. I have been messing around with the GUI for a while without finding a way to show each kind of molecule in its own style. I know that it is far easier to export the trajectory and view it in VMD, but it would be handy if ECCE could do that too.

    1. I (perhaps mistakenly) think I've managed to show the solvent as wireframe and the solvate as ball-and-wireframe in the past.

      I'll have a look and see if I can figure it out again.

    2. Nevermind. That's simply the default view. No, I haven't managed to show each type of molecule (other than solvent) in a different style.

      I've too been using VMD for that.

  3. Hi Lindqvist,

    I found that when simulating a water box in NWChem with the default settings and constant pressure, the system shrinks after a long run, e.g. >1 ns.

    For a system that initially has ~53000 water molecules, the initial box size was ~1600 nm^3. After about 1 ns the system shrinks to 1300 nm^3. I will try to use larger long/short range interaction cutoffs and see what happens.

    If I fix the size of the box (NVT), empty bubbles appear inside the box after a while.

    Have you had this problem before?

  4. I guess I have to set 'reasonable' short range and long range interaction cutoffs. In addition, since the short range cutoff has to be larger than a certain value, there is a smallest reasonable box size that depends on the short/long range parameters.

    1. Hi Alvyn,
      thanks for the updates. I've been busy (I'm teaching again in a few weeks), and so haven't had much time to look at this.

      Have you compared with gromacs? Just interested in whether the behaviour of nwchem is consistent with other packages. Although in this case it seems partly to do with the s/l interactions, but maybe also in how tightly packed the initial box is (corresponds to ice or a high/low pressure water box). But again, I find MD to be akin to magic sometimes, and I really have no expertise in this.

  5. Hi Lindqvist,

    I guess the efficiency of the two systems cannot be compared the way I expected. I have tried various long range cutoffs (1.2, 1.5, 1.8, 2.0, 2.2, 2.3, 2.4, 2.7, 3.0 nm) with spce water in a 7.2 nm sq. water box to see how the system volume changes. For long range cutoffs equal to or smaller than 22 Å (2.2 nm), the system shrinks significantly after around 50 ps. If the cutoff is larger than 2.2 or 2.4 nm, the system shrinks by about 1% and then stays steady with small fluctuations. I am also trying PME with various grid sizes and FFT core numbers.

    We have a student here working on testing gromacs on our system. The speed he gets is much higher than with nwchem (I guess it is because the system has a certain level of simplification), but it does not seem to be faster than your build. I guess we all need to give the essential simulation parameters in order to have a sensible comparison. I will have the student demonstrate his calculation to me and report back here.
