Summary of the Workshop on Federated Neuroscientific Information Systems

                   10-12 July 1997
             Big Sky and Bozeman, Montana

Organized and summarized by Rogene Eichler West

Greetings!

I hope that you arrived home from Big Sky, Montana safely and that you are as excited by the potential of our continued collaboration as I am. We covered quite a bit of information and made quite a few mutually beneficial deals. I hope to review some of that content in this email, and perhaps suggest a mechanism for how we might carry this collaboration forward.

This email is rather long - not the sound bite that emails should (?!) be! I mention this to encourage you to be patient and work your way to the end - there is much good information summarizing where we were and where we want to go!

Let's start with a bit of business.

First of all, I would like to again thank Gwen Jacobs of the Center for Computational Biology at Montana State University for sponsoring the Saturday portion of the workshop. It was a nice prelude to the interactions surrounding the opening ceremonies for the Center later that day.

Secondly, $$$! - I would remind all of you to contact me regarding financial support for attending the workshop. Please indicate which of the following applies to you: 1) CNS attendee who extended their stay to attend the workshop, 2) non-CNS attendee who traveled exclusively for this workshop, 3) Person who assisted with transportation of workshop attendees via a rental vehicle.

Next, I will overview the meeting.

Day 1

On Thursday, we participated in a workshop with CNS participants. We all introduced ourselves and provided a bit of background into our affiliations and our interest in databases, modeling, and experiments. Most participants were involved in databases at some level; several were additionally modelers; the minority were also involved with collecting experimental data. The format was quite informal - simple questions were posed and the participants explored answers in a free-form manner.

The discussion centered around: 1) Tools; 2) Storage Issues; and 3) Financial and People resources.

Tools

        Visualization
                How can we "Zoom" from molecules to networks using the same 
                        visualization tools? What does this mean in terms of
                        organizing the representation of the database information?

        Format Converters
                How can data from published papers be "digitized"? i.e. How can 
                        we take a graph from a paper and extract a time series?
                How can we convert from, for example, confocal data into formats
                        which can be used by a simulator?

        Parameter Search/Optimization
                How can we explore the combinatorial space of data/simulation
                        possibilities most effectively?
                What datamining tools are available?
                How can we "fill in the gaps", to make a good guess when information
                        is missing?
 
        Scale
                How can we analyze data to integrate across various scales? For
                        example, how can we move from EM images to single cell
                        LM images to MRI images?
                Can comparative registration help us with this task? How do we 
                        implement this across multiple databases?
        
        Quantitative/Time Series Data
                Histograms?
                Reverse Reconstruction Tools?
                        
        Interfaces to Analysis Tools
                These should be easy to use. Not everyone enjoys hacking code to
                        make analysis happen! What can be online and intuitive?
        
        Standards
                It was generally agreed that if tools are already available, most
                        persons are likely to adopt that data format as a standard.
                However, it might be quite useful initially to agree on a common 
                        format.
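
As one small illustration of the "digitizing" question under Format Converters above, here is a minimal sketch in Python. It assumes someone has already traced the curve in a scanned figure as pixel coordinates and has read two calibration points off each axis; the function names and the numbers are hypothetical, and it only handles linear axes. It is meant to show the idea, not any tool discussed at the workshop.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a data value.

    Assumes a linear axis calibrated by two known (pixel, value) pairs.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale


def pixels_to_time_series(points, x_map, y_map):
    """Convert traced (px, py) points into (time, value) pairs sorted by time."""
    return sorted((x_map(px), y_map(py)) for px, py in points)


# Hypothetical calibration read off a scanned figure:
# x pixel 100 is 0 ms and x pixel 500 is 200 ms;
# y pixel 400 is -80 mV and y pixel 100 is +40 mV (image y grows downward).
x_map = make_axis_map(100, 0.0, 500, 200.0)
y_map = make_axis_map(400, -80.0, 100, 40.0)

# Hypothetical traced points, in no particular order.
traced = [(300, 250), (100, 400), (500, 100)]
series = pixels_to_time_series(traced, x_map, y_map)
```

The harder parts of the problem (finding the curve in the image, log axes, overlapping traces) are left out of this sketch on purpose.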

Storage Issues

        Data Ownership vs. Tools Ownership in a distributed environment
        Access to Raw data vs processed data? [Can we afford to store it all?]
        Adaptivity of data types - ex. 1) cells change over time 2) new data
                types will likely emerge as technology advances. Can the dbase
                accommodate this change?

Financial/People Resources

        Is the NLM a potential source of funding?
        How do we retain current expertise with the current job market?

Additional Issues

        Are there mechanisms by which the Society for Neuroscience or major
                journals can request that published data be deposited into
                an online repository? [As is currently required for sequence data]

        Focus on the Science! The databases serve no purpose merely as storage
                devices - what is important is how they assist in answering
                scientific questions!

Day 2

On Friday, the morning session contained talks by the individual centers.

The speakers were:

        Rogene M. Eichler West, Caltech http://www.bbb.caltech.edu
        Mark Nelson, UIUC               http://soma.npa.uiuc.edu/isnpa
        George Wilcox, UMN              http://brain.med.umn.edu/braincntr.html
        Mike Stiber, UC Berkeley [MSU]
        Claus Hildegard, Newcastle/ICN
        Jack Glaser, MicroBrightField

I will send out the summary of the talks next week.

The afternoon was filled with discussion on how to make a collaboration work and in particular, deals were made between many groups - as briefly reviewed below.

Day 3

Saturday was filled with two talks and several demos.

Jenny Forss described Caltech's database and model construction interface, known as the Modeler's Workspace. This work resulted from her Master's thesis project, which she defended this spring. She showed us the prototype for a tool that can select morphologies and channel prototypes from a database of such objects. These objects can then be combined into a model which can be simulated by GENESIS.

The prototype for Jenny's MW can be accessed at:

http://smaug-gw.caltech.edu:8081

She would appreciate any feedback on function and stability that you might have to offer.

Ihab Awad [Minnesota] discussed technological issues in database integration. He got us up to speed on the buzz words and presented an analysis of 'Multidatabase' technological solutions. He then introduced some distributed systems tools, comparing and contrasting CORBA with Java technologies. The main point of Ihab's talk was to introduce some potential solutions for helping all the databases "talk" to one another.

If you did not get a copy of Ihab's overhead notes, you can download a copy at http://pain.med.umn.edu/ihab/cns97

The demos included Caltech's Modeler's Workspace, ICN's web-based channel reference, Minnesota's NeuronViz single cell visualization tool, and CCB's suite of tools for analyzing cricket structure and function.

Next, let's discuss the deals we made and where we might go next. I tried to take good notes on these details, but I hope that you will forgive any errors I have made. It is important that we communicate well by voicing and correcting any misunderstandings. Just point out my mistake! I can promise you it was honestly made!

UIUC - Mark Nelson's group would like to map their time series data onto a 3D map of the electric fish. George Wilcox at Minnesota suggests that he might currently have some software tools that can assist. UIUC is also looking for some filters to convert various formats into TSDP format and analysis tools. Caltech is willing to contribute such tools, as they will already be in development for the EA parameter search part of the '98 project. UIUC would also like some help developing a standardized format for time series storage. While no one volunteered for this task, perhaps this will be addressed as additional dbase specialists are brought on the team. In return, UIUC offered the use of their current analysis tools as well as disk space on their jukebox drive.
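
To make the time series format question a little more concrete, here is a minimal sketch in Python of what a self-describing record might carry. Nothing here was agreed at the workshop and it is not the TSDP format; the class name, the field names, and the choice of JSON as the storage encoding are all assumptions made purely for illustration. The point is only that the samples travel together with the units and sampling information needed to interpret them.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class TimeSeriesRecord:
    """Hypothetical record: a uniformly sampled trace plus interpreting metadata."""
    source: str                 # free-text description of where the data came from
    units: str                  # units of the sampled values, e.g. "mV"
    sample_interval_ms: float   # time between consecutive samples
    start_time_ms: float = 0.0  # time of the first sample
    samples: list = field(default_factory=list)

    def times(self):
        """Reconstruct the time of every sample from the metadata."""
        return [self.start_time_ms + i * self.sample_interval_ms
                for i in range(len(self.samples))]

    def to_json(self):
        """Serialize the whole record, metadata included, to a JSON string."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        """Rebuild a record from the string produced by to_json."""
        return cls(**json.loads(text))


# Hypothetical example values.
record = TimeSeriesRecord(source="example recording", units="mV",
                          sample_interval_ms=0.5,
                          samples=[-65.0, -64.5, -20.0])
restored = TimeSeriesRecord.from_json(record.to_json())
```

A real standard would also need to cover non-uniform sampling, multiple channels, and experimental conditions, none of which this sketch attempts.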

Emory - Dieter Jaeger is an experimentalist with a great deal of Purkinje cell time series data available. He would be willing to unload much of his data off disk onto the UIUC site in return for a version of Minnesota's NeuronViz which will compile on a PC running Linux.

Minnesota - George Wilcox's group is looking for some time series analysis tools and assistance with dbase development of both their own DBMS and the ICN DBMS. Minnesota is currently contributing 3D brain mapping and single cell visualization tools, as well as access to the supercomputing facilities/cycles for the '98 demo. There are also matching funds potentially available to hire a technical person through the Minnesota Supercomputer Institute. MN is also potentially collaborating with MicroBrightField to put cell morphologies online. Caltech and UIUC both have time series analysis tools available to contribute. Additional discussion may reveal a source of matching funds to hire a dbase expert.

MicroBrightField - Jack Glaser has offered to provide a web site/dbase containing single cell data. The development of this site is potentially in collaboration with Minnesota. Caltech will contribute data conversion tools so that the morphologies can be simulated.

UCB - Mike Stiber cautions us to think about designing software that will LAST and to consider doing what we can to change the SOCIOLOGY of the current academic/development environment. Mike would like to contribute his expertise to this project, and may become more involved as his position becomes more stable. Mike is perhaps the leading expert in software design/databases within this collaborative group, so we wish him success and hope that he will be able to join us.

ICN - Ed Conley was unable to attend the conference, but asked Malcolm and Claus to represent him. So far as my understanding goes, Ed has all the content complete at this time [as indicated by the published volumes], but needs help finding dbase experts to turn the book series into an online resource. Minnesota has worked on this problem for a while and is still committed to helping the ICN get online, but lacks the personnel to make this happen.

Newcastle - I did not realize until the CNS poster session that Malcolm and Claus ALSO have a database. I had hoped to talk with them a bit more about potential collaboration on Saturday, but as these things go, we never got a chance to discuss this in more detail. What they are looking for is more quantitative information on cell densities and perhaps [?!] additional parameter search tools. This will require a bit of email to sort it all out, but Caltech is willing to contribute some parameter search tools in return for access to the anatomical connectivity data. MicroBrightField has tools [Stereology?] that can estimate cell density. However, I can only mention the existence of such a tool and leave the two groups to discuss the details.

NPACI - Caltech and UCSD [Mark Ellisman] have collaborations with NPACI. At a recent meeting at the SDSC, Caltech volunteered to help define the metadata in the neuroscience thrust area. We would like to confer with all of you on this topic before the next NPACI meeting. We still need to find out what resources might be potentially available to the Federation through this collaboration. For example, might we have access to large scale data storage or database experts who can assist with technical implementation?

Montana/UCSD - With Gwen and Mark so busy with the CCB preparations, we were unable to get much feedback with respect to if and how Montana/UCSD might benefit from federating. I am hoping that we can get a bit of discussion going via email over the next few days.

As I mentioned before, we will be submitting a renewal grant for our Dbase project this 15 Sept. We would like to formalize part of our commitment by establishing some partners as part of our proposal. We should begin discussions on how this might proceed. What funding possibilities exist for non-US groups such as Newcastle and ICN?

Is anyone interested in pursuing an SBIR grant with MicroBrightField?

What kind of influence can we have on journals and/or the Society for Neuroscience such that published data must be contributed to experimental dbase sites? [Like GenBank???]

What is the possibility of establishing a collaboration with a commercial database company [for example, but not necessarily, Oracle] for assistance in getting some of our tech issues solved? They get a tax break and good PR for helping out nine world-wide dbase-federated universities.

I have received several emails (10!) since the CNS meeting from persons who are interested in accessing a database appropriate for modeling content. I am quite encouraged that we are headed in the right direction!

I will include everyone's emails below. I will try to initiate communications between groups of you this week, but I hope that you will all take the initiative to pursue contact with each other. Additionally, I will look into setting up an email alias here at Caltech, so that a single email address will forward messages to the entire group.

Participants:

Rogene Eichler West     Caltech          rogene@bbb.caltech.edu
Jim Bower               Caltech          jbower@bbb.caltech.edu
Jenny Forss             Caltech          jenny@bbb.caltech.edu
Dave Beeman             Caltech/UC       dbeeman@dogstar.colorado.edu
Dave Bilitch            Caltech          dhb@bbb.caltech.edu
Venkat Jagadish         Caltech          venkat@bbb.caltech.edu
Mark Nelson             UIUC             nelson@vernal.npa.uiuc.edu
Gwen Jacobs             Montana          gwen@wintermute.nervana.montana.edu
Kelli Hodge             Montana          kelli@wintermute.nervana.montana.edu
Mike Stiber             UC Berkeley      stiber@cicada.berkeley.edu
George Wilcox           Minnesota        george@med.umn.edu
Chris Honda             Minnesota        honda@med.umn.edu
Ihab Awad               Minnesota        awad@cs.umn.edu
Claus Hildegard         Newcastle        cch@crunch.ncl.ac.uk 
Malcolm Young           Newcastle        Malcolm%flash@newcastle.ac.uk
Ed Conley               ICN              ecc@leicester.ac.uk
Mark Ellisman           UCSD             mark@alex.ucsd.edu
Jack Glaser             MicroBrightField jglaser@microbrightfield.com

Thanks again for such a successful workshop! Let's go make good things happen and gear up for the big demo in 1998!

Cheers,

-Rogene

Rogene M. Eichler West, Ph.D.
California Institute of Technology
Universitaire Instelling Antwerpen, Belgium