Conferences: Talking Open Science at OSCON, Desktop Summit and Chemical Databases Meeting

Over the last two months I have had one of my most hectic travel schedules ever. It started with OSCON, and a panel discussion about "Practicing Open Science". This one was a bit of a surprise, as Bill Hoffman was originally presenting with Will Schroeder and Brian Wylie, from Sandia National Laboratories. As Bill couldn't make it we decided to change the content of my section, and talk about the new open chemistry area that I have been working on for about four years now. Will went first, followed by me and a wrap-up from Brian, with a nice flow between Kitware working on open science for over a decade, me growing a new area of open science (now at Kitware) and Brian giving a government perspective on open source and open science. The slides are below and on slideshare if you would like to take a look.

I thoroughly enjoyed OSCON, and would love to attend future events. The toughest thing was deciding which talks to attend as there were often multiple tracks with talks of interest to me. This was also by far the largest and most commercialized open source event I have attended so far, in the beautiful city of Portland, OR. I couldn't stick around for long after the conference as I was flying out to England on the following Tuesday, and on to Berlin, Germany on Friday to attend the Desktop Summit. This was my first time in Germany, and I was looking forward to exploring Berlin a little, along with some time to catch up with a few family members and friends in England before and after the conference. I talked about "Open Source Visualization of Scientific Data" on the final day of the main conference, and was very pleased to have a large and interested audience. Here I also discussed my work in open chemistry, along with a lot of the other work we do at Kitware in the Scientific Computing group.

I stayed for the remainder of the conference, attending my first KDE e.V. meeting, and was joined by Bill Hoffman towards the end of the week. Bill gave a workshop on using CMake, and I helped out with that, along with taking part in several BoF sessions and meetings. It was a very hectic week with a very different feel to OSCON, and a lot of great presentations, BoFs and hacking sessions. I also had the opportunity to meet up with Alexander Neundorf, who was an intern at Kitware for half a year, and several other KDE developers interested in build systems, software process, testing, coverage and related areas.

Then I was back home for just over a week before braving the elements and heading straight for the path of Hurricane Irene. I was invited to the 5th Meeting on U.S. Government Chemical Databases and Open Chemistry, where I talked about "Chemical Databases and Open Chemistry on the Desktop". This meeting was very focused on chemical databases and the open chemistry I have been working on so hard for the last few years. It was a great experience to be able to see what others are working on, and discuss possible points for future collaboration. There is some amazing work happening in this area, and this meeting helped me gain greater clarity on how my work at Kitware can fit into the larger picture to significantly improve the landscape in open chemistry.

Thanks to Kitware for allowing me to attend, and funding my travel/other expenses, and to my wife and son for tolerating my long absences over the last couple of months. An even bigger thank you to my wife, Louise, for letting me off the hook on my first missed wedding anniversary so that I could present at OSCON! I had some great news about funding for the continued development of many of the ideas discussed in the slides, and so hope to have much more to talk about over the coming months (and years). This post is already pretty long. I hope to continue developing this work and promoting open science, especially in chemistry, materials science, physics and the bio areas. There are lots of other amazing people working in these areas too, and I feel like we are getting to a point where we can create real change to improve the outlook in scientific research.

Talking About Open Source Visualization of Scientific Data at the Desktop Summit

I have begun my journey to the Desktop Summit, making the flight over from the US to Manchester yesterday. A short stay in Sheffield, and catch up with family before heading out to get my flight to Berlin tomorrow. I will be talking about the work I have done both at and before joining Kitware with the title "Open Source Visualization of Scientific Data". I plan to talk about a range of work from my Google Summer of Code project on Kalzium back in 2007, through to some of the exciting work at Kitware in VTK, ParaView and Titan looking at the challenges of large data, remote visualization and how to integrate the web and smartphones/tablets into the scientific data visualization workflow.

Desktop Summit 2011

Bill Hoffman is also planning to attend, and we will be running a workshop introducing CMake on Thursday. This is my first Desktop Summit, although Bill and I have both attended previous aKademy and Camp KDE meetings. I should be in on time to attend the pre-registration event, and will not be leaving until Saturday. Looking forward to a great summit, catching up with some old friends and making some new ones. Now, I think I should try to get some sleep before my flight tomorrow!


Talking at OSCON 2011 about Open Science

I am currently on a plane bound for Portland, Oregon enjoying the in-plane wi-fi. Will Schroeder, Brian Wylie and I will be talking about "Practicing Open Science" on Friday in the government track. I am standing in for Bill Hoffman who unfortunately could not make it, and will be discussing the work I have been doing to grow open chemistry both at Kitware and outside of Kitware with many amazing collaborators scattered around the world. I am really excited to have the opportunity to talk at OSCON, and would be happy to meet up and discuss this work if you are at OSCON. Will and Brian are both very passionate about open science too, they will both give their unique perspectives on practicing open science. I will be there from this evening and don't fly out until early Saturday morning.

OSCON 2011

I am very much looking forward to OSCON, and the major difficulty I have had is choosing between the talks that are all happening at the same time. In some cases there are two or three I would like to see in any given slot. I am hoping to attend the KDE release party tomorrow too, please join us there if you would like to celebrate with us.

Blue Obelisk Award

At the recent ACS Spring meeting I attended the Blue Obelisk dinner, where I was honored to receive a Blue Obelisk award, pictured below, for my contributions to Open Data, Open Standards and Open Source. This is largely due to the work I have done on Avogadro, Open Babel and other open source chemistry tools.

Blue Obelisk award

This was one of the biggest dinners I have had the opportunity to attend, and I got to meet many of the people I have worked with (or whose work I have used), along with several people I had not had the opportunity to work with yet, but hope to in the future. We presented the work we had been doing on the Quixote project at the chemical information symposium on chemistry and the internet, after attending the first Quixote meeting the previous week (thank you to the Hartree Centre for inviting me to speak there, and sponsoring the event).

These are exciting times, thank you very much to Peter Murray-Rust for presenting me with the award, and all of the support he has shown, along with his relentless passion for open science. I have only been a part of this for a few years, but Peter has been working on opening up chemistry for decades now.

Visualization Toolkit (VTK) in the Google Summer of Code

As I already mentioned on the Kitware blog, the Visualization Toolkit (VTK) has been accepted as a mentoring organization for Google Summer of Code this year. You can see the VTK entry in Melange, and browse through our project ideas. I have taken part in the Google Summer of Code program since 2007 (first as a student, and later as a mentor) as part of the KDE project. I still maintain close ties to KDE, and work on several related projects such as Avogadro, CMake and VTK. VTK has Qt integration, and ParaView builds on both VTK and Qt for the visualization of large scientific data sets.

If you are a student, and would like to work on an exciting open source project, processing and visualizing some of the largest scientific data sets in the world, take a look at the Visualization Toolkit. There is a wide range of ideas, and if you have an idea you think would fit then please feel free to discuss it with me. I will let you know if it would be a good fit, and whether we have available mentors for the proposed project. We have mentors available who are experts in visualization, large data, parallel algorithms and related technologies. The core of VTK is written in portable C++, with new changes being tested daily. Our API is automatically wrapped in Python, Tcl and Java.

I am very excited about VTK's first year in the Google Summer of Code. This represents a unique way for students to get involved in a large, well-tested open source project. We have started using Gerrit for code review, and you can view build and test results on many platforms for VTK both continuously and nightly. We have a well-established software process which will serve you well in any project where software quality is important, with nearly 1400 unit and regression tests. This is a large, collaborative project with more than 100 contributors last year (as measured by Ohloh).

CMake External Projects: Building Project Dependencies

Historically projects have attempted to minimize their dependency list, and often bundle in small third party libraries in an attempt to make things easier for new developers/users to compile their code. In the Avogadro project we have bundled a few really small libraries, but on the whole have maintained a dependency list and tried to keep it small. As I work on new code, I see opportunities to break off bits of functionality, such as with OpenQube, but don't want to add yet another thing a new user or developer must download, compile and install somewhere.

Linux packagers, myself included, dislike the practice of bundling in libraries. It means that instead of patching one libxml2, we get to patch one plus the three or four in our tree that have been bundled (often with different versions and some local patches). The problem is less pronounced on Linux where package managers are ubiquitous and we are able to provide a list of packages to install, but even there we might be developing against versions not yet in the main distribution repository. This is one of the reasons I have always favored rolling release distributions over periodic releases.

CMake's external project module helps us to deal with this issue in quite an elegant fashion. Coupled with meta repositories to bring several source trees together, CMake is able to direct the build of several projects, passing locations between projects and expressing dependencies between the projects being built. This means that something like Open Babel can build zlib and libxml2 before building the main Open Babel library. External projects and CMake allow us to download the source, create the build trees and even direct the build of non-CMake based projects like libxml2.
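As a rough illustration of the idea, a top-level CMakeLists.txt along the following lines could direct such a build. This is only a sketch and not the actual Open Babel build system: the download URLs are placeholders, the single install prefix inside the build tree is an assumption, and the libxml2 step assumes a Unix-like host where its configure script and make are available.

```cmake
# Sketch of a top-level build that uses external projects: zlib and libxml2
# are built first, then Open Babel is configured against them.
# All URLs are placeholders and must be replaced with real source tarballs.
cmake_minimum_required(VERSION 3.16)
project(OpenBabelSuperBuildSketch NONE)

include(ExternalProject)

# Assumption: everything installs into one prefix inside the build tree.
set(prefix "${CMAKE_BINARY_DIR}/prefix")

# zlib ships a CMake build system, so only the install prefix is passed.
ExternalProject_Add(zlib
  URL "https://example.org/downloads/zlib.tar.gz"       # placeholder URL
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${prefix}
)

# libxml2 is driven through its configure script rather than CMake.
ExternalProject_Add(libxml2
  URL "https://example.org/downloads/libxml2.tar.gz"    # placeholder URL
  CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=${prefix} --with-zlib=${prefix}
  BUILD_COMMAND make
  INSTALL_COMMAND make install
  DEPENDS zlib
)

# Open Babel is configured only after both dependencies are installed,
# and is told where to find them through CMAKE_PREFIX_PATH.
ExternalProject_Add(openbabel
  URL "https://example.org/downloads/openbabel.tar.gz"  # placeholder URL
  CMAKE_ARGS
    -DCMAKE_INSTALL_PREFIX=${prefix}
    -DCMAKE_PREFIX_PATH=${prefix}
  DEPENDS zlib libxml2
)
```

The DEPENDS entries express the ordering between projects, and the shared prefix is how locations are passed from one project to the next.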

I have a prototype of this that I just put up to build the core of Avogadro, its working name is Avogadro Squared as I was feeling geeky that day and had no good names. One thing you should note is that everything in there is an external project, and Avogadro is the last one to be built (it depends on all of the other projects). It requires minimal changes to the projects it contains, it uses git submodules for some of the source, and CMake's download and tar functionality for zlib and libxml2. I will be adding options to simply use system versions of the libraries it can build, but Linux distributions etc can continue using the Avogadro repository directly.
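A minimal sketch of what such a glue repository might contain is shown here. It is not the actual Avogadro Squared code: the `USE_SYSTEM_ZLIB` option name, the submodule directory names and the placeholder URL are all assumptions made for illustration.

```cmake
# Sketch of a meta repository: every component is an external project and
# the application is built last. Names and paths are hypothetical.
cmake_minimum_required(VERSION 3.16)
project(MetaRepoSketch NONE)

include(ExternalProject)

set(prefix "${CMAKE_BINARY_DIR}/prefix")

# Hypothetical switch: ON skips the bundled build and relies on the system zlib.
option(USE_SYSTEM_ZLIB "Use the zlib already installed on the system" OFF)

set(zlib_dependency "")
if(NOT USE_SYSTEM_ZLIB)
  # Downloaded as a tarball rather than tracked as a submodule.
  ExternalProject_Add(zlib
    URL "https://example.org/downloads/zlib.tar.gz"     # placeholder URL
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${prefix}
  )
  set(zlib_dependency DEPENDS zlib)
endif()

# Source provided by a git submodule checked out next to this file
# (assumed directory name: openbabel).
ExternalProject_Add(openbabel
  SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/openbabel"
  CMAKE_ARGS
    -DCMAKE_INSTALL_PREFIX=${prefix}
    -DCMAKE_PREFIX_PATH=${prefix}
  ${zlib_dependency}
)

# The application is the last project and depends on everything else.
# It is left uninstalled so development can continue in its build directory.
ExternalProject_Add(avogadro
  SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/avogadro"
  BINARY_DIR "${CMAKE_BINARY_DIR}/avogadro"
  CMAKE_ARGS -DCMAKE_PREFIX_PATH=${prefix}
  INSTALL_COMMAND ""
  DEPENDS openbabel
)
```

With SOURCE_DIR and no download step, CMake expects the submodule directories to be populated already, so the submodules need to be fetched before configuring.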

As a new developer or user I can check out the meta repository, have git download the submodules and CMake download the source tarballs. I can then build the entire project, and then continue to work in the Avogadro subdirectory of the build tree after that. That build tree is almost identical to the one I would have ended up with had I not used the meta repository, except it points to the dependencies I just built. I can then use vim, an IDE or whatever I choose to work on the inner projects. This works across Linux, Mac and Windows to get new users and developers up and running very quickly while only loosely coupling the dependencies to the Avogadro project.

I have worked on other larger projects, such as Titan and ParaView, that are using this approach to a greater or lesser extent. Titan can actually build Qt, Boost, VTK, protobuf, Trilinos and a host of other dependencies before building the Titan libraries and applications. I think Avogadro Squared is an example of just how minimal a meta repository can be. Although I will be extending it with more dependencies, it really is just a glue repository.

One Year at Kitware...Already?!?

It is hard to believe, but I have been at Kitware for just over a year now. How are things going? I would say very well...I am very pleased I made the move, and that Bill Hoffman pushed me into applying after meeting him at the first Camp KDE. Kitware is growing fast, we are always on the lookout for new talent and I am already starting to feel like an old timer with all of the new employees joining.

I had my first ever annual review, which went well. We received word in September that my first SBIR proposal had been accepted, and we are working on getting the contract in place for that. So watch this space - a great collaboration coming up working on open source chemistry visualization, editing, database integration, and computational chemistry input file generation along with analysis of the outputs. I think this is a great opportunity to extend VTK, and Avogadro.

I took a very active role in our migration to Git, and I am pleased to say that it has been going well. I also more recently got Gerrit up and running, introducing tightly integrated code review to some of our open source projects at Kitware. I played a large role in setting up one of our most complex build systems to date for Titan, where it can build Qt, Boost and VTK (among other dependencies) using CMake's external project features. I have also had the opportunity to work with some of the Boost developers, and am helping with their build system work.

I have mostly concentrated on 2D rendering in VTK, using OpenGL. I replaced the existing 2D charts in VTK and ParaView with new charts using a new 2D rendering abstraction. So we now have a selection of chart types, with interactivity, that can be used on both client and server side. More recently I have been going even lower level, and working on FreeType font rendering in VTK, and seeing what I can do to improve the capabilities there.

There is lots of other stuff, some of it I have talked about here, and other bits I will when I find time. It has been a great first year, and looks like it is shaping up to be an even better second year. I feel very lucky to be getting paid to work on open source, open science and I get to work on some very interesting problems that help real scientists. Going forward I hope to introduce more scientists to open source, open data, open standards and collaboration platforms. I am very privileged to have worked with so many forward looking scientists over the last few years, and am a proud unmember of the Blue Obelisk.

I think Kitware is the perfect place for me to push forward open source in science, and am refreshed that I rarely need to push anyone here in that direction. I have been driven to learn a lot of new things in the past year, and it has been tough at times, but I have thoroughly enjoyed it. There are some really amazing projects coming up in the next year - so watch this space!

Returning From Hibernation...

Wow, I just looked and I haven't written a thing since January! For those of you who might have been worried, or just wondered what I was up to...here is a quick run down. I am going to start with a little advice: combining starting a new family with moving from academia to industry and moving house is tough ;-) I have been really focused on work, home and one big conference, and kinda shut down otherwise.

I hope to remedy that in the coming months, and have started by doing some development for Avogadro and Open Babel. I also got Kalzium in KDE trunk ported to use the system Avogadro library, with some help from Pino Toscano. So KDE 4.5 will feature a Kalzium using the system installed Avogadro, this prompted a couple of bug fixes in Avogadro. So after that I tagged and released a much delayed Avogadro 1.0.1 with several bug fixes.

Way back in March Kitware was kind enough to send me out to the March ACS meeting, where I presented a talk on VTK, ParaView and their use in chemistry. I also gave a talk on Avogadro, and its use as a framework in chemistry visualization, which Geoff followed up with a talk on some applications of the Avogadro framework in his research.

The ACS conference deserves a full post of its own, but I feel like it has been so long I will just summarize a few of my thoughts. There were some other really interesting talks on visualization, and how it can be applied in chemistry. I got a general feeling that commercial software still has too much of a stranglehold, and hope to see that change as we develop powerful open source platforms that can be shared by all. There is a definite need for this in chemistry, and I am doing everything I can to seek some funding to further that cause. Failing that, I will continue to do what I can in my spare time.

I was honored to meet members of the Blue Obelisk for the first time. Saw some great talks about open science, open data, open standards and open access. I especially enjoyed meeting and seeing Peter Murray-Rust talk for the first time, I found that I share many of his ideals. I think we differ in some places, but life would be boring if that were not the case!

Our son, William, is nearly one year old already! He might be a big part of the reason why I have been inactive. The kinds of sleep deprivation torture you go through with children are indescribable :-P He is thankfully sleeping quite well now, and even took his first two steps yesterday.

We had our first visitors in our new home - friends from Pittsburgh and Washington DC all came up for a weekend. I fired up our new BBQ, an enormous American style with offset fire box. Made some amazing ribs, and shared some of the home brew I made - a portable porter, and an English brown ale (first two batches). We are just getting ready for a trip to Pittsburgh, and then William's first birthday (planning a small party at our place).

Then there is work, lots of exciting things are happening there. I taught my first course at Kitware, going through ParaView plugins. The new CMake book came out (I am one of the contributors to the new edition), and the new VTK book came out at around the same time. Kitware is hiring, so please let me know if you are interested in applying. We have some really interesting projects to work on, most of my time is spent on something called Titan. Last Friday I also pumped the tyres up on my bike, and rode into work for "Bike to Work Day".

I have skipped loads of stuff, but already wrote more than I intended. I will see if I can be a little more disciplined and write more frequently. My current problem is finding time to fit everything in, but I have a new strategy I am working on in order to do better. Life after the big 30 is certainly different. I feel energized again, and hope to be writing about more fun and interesting stuff I am doing over the coming months.

VTK: New 2D API, Canvas and Charting Features

Since joining Kitware in October, one of the first projects I was tasked with is revamping the 2D charting capabilities in VTK and ParaView. At first I was a little daunted as it meant digging through many of the internals of VTK, and breaking an assumption that is made in many parts of VTK - that everything being rendered is 3D.

A large portion of this work is also being driven by the InfoVis features in VTK, along with project Titan that we work on with some really interesting people from Sandia National Labs. The project grew quite a bit from its original scope, and I have now added some new 2D API that uses OpenGL as a backend, with the scope to add further backends in the future. I have been working on optimizing the OpenGL case so that large data sets can be rendered interactively, and small data sets can be rendered with minimal lines of code whilst giving pleasing visual results.

ParaView with 2D API canvas based VTK chart

Then when considering user interaction with these 2D elements we decided that a higher level API would be useful, that could contain objects and propagate mouse events to items in the scene. So I set about prototyping a new canvas based API. At this point I had enough new infrastructure that I felt it was about time I got back to my original task of implementing some efficient, well rendered 2D charts in VTK. Once I had my initial prototype in place it was time to expose this in ParaView and see how everything fitted together. As you can see in the screenshot above, things are shaping up very nicely. The new chart is in the bottom right widget, the chart above is the existing chart widget.

I have really enjoyed my first few months at Kitware, and have found my first project both challenging and rewarding. It is great to be working on real problems that have a broader impact, and as I flesh out these features I will try to maintain cross platform, high performance interactive charts. I think I have also added some useful new 2D focused API that can also be rendered over the top of VTK's existing 3D visualizations, opening the door to some very exciting new views on data.

As a physicist I also find the symmetry interesting - Qt adds 3D to a 2D toolkit, and at the same time I am adding 2D to a 3D toolkit. Hope you all have a Merry Christmas, and a Happy New Year. I will be tracking Santa with my son this evening!

Disclaimer: The opinions and musings in this post are mine, and not those of my employer. Any mistakes/inaccuracies are also mine. That said, I would love to hear what people think of this new work.

Avogadro 1.0.0 Released!

It is with great pleasure that I announce the release of Avogadro 1.0.0. After many years of work we have released what we consider to be a stable Avogadro release on Mole Day, which seems appropriate given the project's name. There are still some rough edges, but I think this is a good release. With your help we can fix bugs in the release while working on new features in trunk.

Avogadro - Code Swarm from Marcus Hanwell on Vimeo.

What better time to look back to the beginnings of Avogadro. There was a blog post made today by Sourceforge about Avogadro detailing a little of that history. I have also made a code_swarm movie visualizing the history of the Avogadro project. There have been quite some changes in that time both at a project level and a personal level.

I would like to thank Google for sponsoring me for a GSoC project in the summer of 2007. Also Geoff Hutchison for giving me the opportunity to work with him at the University of Pittsburgh on interesting computational and visualization projects. Then there is my new employer, Kitware, who have provided me with an exciting opportunity to push scientific visualization and cross platform development to its limits.

To finish off a great day, my wife has informed me my new espresso machine has arrived! I am going to Camp KDE in January too!

Avogadro Auto Optimization Screencast

Geoff showed me a new screencast he created recently. It is made using the latest Avogadro, and is one of the first screencasts with our new and improved user interface. Geoff has also added some audio commentary with notes on the chemical relevance of the auto optimization tool. Check it out and let us know what you think - a new release of Avogadro is coming soon.

I will hopefully find the time to make a few new screencasts soon too. Between my one month old son, day job and waiting on my visa application (does not take any real time - some mental drain) I have not had much spare time to code or blog. Remember that Avogadro was nominated for the SourceForge community choice awards too - click on the link below to vote for us.

Avogadro Nominated for SourceForge Community Choice Awards

I am very pleased to announce that Avogadro has been nominated as a finalist in the SourceForge community choice awards this year. We are in the "Best Project for Academia" category, and I would like to encourage you to vote for Avogadro.

This is a real honour for all of us, and I appreciate all of you who nominated Avogadro. We are all pushing very hard on polishing Avogadro, getting ready for our 1.0 release. It would be absolutely amazing to see Avogadro win this award, so please vote for us.

Avogadro collage

There are some other really nice projects in there too, such as Lancelot, ClamAV, phpMyAdmin and RepRap. So please take a few moments to place your vote, and tell your friends!

Update: You can vote even without a SourceForge account - just enter your email address and verify your vote.

Avogadro at the APS March Meeting and Q-Chem Workshop

So last week was extremely busy. The APS March Meeting was held in Pittsburgh and Q-Chem held a workshop on Q-Chem at the end of the week. I presented a poster on Avogadro (shown below), met lots of interesting people and got lots of new ideas for both research and Avogadro.

Avogadro poster

As we push towards making a 1.0 release of Avogadro, getting feedback from users in the scientific community is extremely important. As Q-Chem chose to use Avogadro as the builder/visualizer in their workshop I had the opportunity to observe new Avogadro users interact with our application for the first time. I also had the opportunity to help them overcome some initial issues and gained a few new insights.

I was very pleased to meet people at all stages of their career who were very interested in having an open source application that can provide a framework for building and visualizing molecules. I also realized that two of the most sought after features in Avogadro right now are the capability to easily make movies, and a z-matrix editor. People loved the ray-traced images of surfaces, coincidentally I received a request from someone in the press wanting to use an image I put up on my blog last year of ray-traced benzene molecules.

I look forward to hearing from some of the new users we gained in the last week. It is great to see Avogadro receiving more attention. I have started to work on the z-matrix editor and spent the weekend experimenting with movies - more to come soon!

Manifest Hell...The New DLL Hell?

I am going to shock you all and admit that I am not a Windows programmer! I do however know quite a bit about cross-platform development and have now learnt the joy that is the manifest in Windows development. It seems that DLL hell is now a thing of the past, all hail manifest hell. A week ago I was not aware of these wonderful little files and this post is an attempt to document what I learned while packaging Avogadro for Windows.

After listening to Bill's talk at Camp KDE I was convinced that it really was a good idea to use CPack to package Avogadro. So when I got back home I spent a day or two getting a Windows development environment set up for Avogadro. We have several dependencies such as Qt and this post documenting a bug in the Visual Studio 9 service pack. So I edited the manifest and lied about the version of the DLLs - it believed me.

Next I found that none of our plugins would load. They are Qt plugins that implement most of the functionality in Avogadro. I found a very long thread on this subject, the crux of which is that the embedded manifests are causing Windows to look for the runtime libraries in the plugin directory. Copying them did indeed work but was not optimal. I found that by adding set(CMAKE_MODULE_LINKER_FLAGS "${CMAKE_MODULE_LINKER_FLAGS} /MANIFEST:NO") to our CMakeLists.txt manifests were no longer made for our plugins (which are loadable modules).
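For context, this is roughly how that line might sit in a CMakeLists.txt. It is a sketch rather than the actual Avogadro build files: the project name, the plugin target and its source file are hypothetical, and the MSVC guard is an addition so that other toolchains never see the flag.

```cmake
# Sketch: disable manifest generation for loadable modules only.
# The target name and source file below are hypothetical.
cmake_minimum_required(VERSION 3.16)
project(PluginManifestSketch CXX)

# /MANIFEST:NO is an MSVC linker option, so only add it for that toolchain.
if(MSVC)
  set(CMAKE_MODULE_LINKER_FLAGS
      "${CMAKE_MODULE_LINKER_FLAGS} /MANIFEST:NO")
endif()

# MODULE libraries are the loadable plugins. CMAKE_MODULE_LINKER_FLAGS is
# used when linking them, while executables and SHARED libraries have their
# own linker flag variables and are left unchanged.
add_library(exampleplugin MODULE exampleplugin.cpp)
```

Because the flag is appended to the module linker flags only, the main executable and shared libraries would keep their manifests in this arrangement.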

So all is quite well with our Windows build now I think. If you would like to try out a new Windows installer, then please download it from here and let me know if it works for you. I have tested it on Windows 2000 and XP virtual machines. This is not the final installer, I need to add some extra data files for OpenBabel and ensure I got the other dependencies right.

If anyone with more Windows experience knows of better solutions please let me know. CPack is absolutely awesome, it seems a shame that the deployment of applications is made so difficult. I know that plugins are not as widely used and so hopefully this post will add to the collective knowledge indexed by Google.

Qt Going LGPL

Just to add that I think this is absolutely amazing news - Qt is going to add LGPL licensing as an option. For the few situations I have questioned using the Qt library it has mainly been an issue of license. I think this is huge news and now cannot see any downsides to using Qt when working on C++ GUI applications and libraries.

Thanks - I can't wait to thank some of you in person at Camp KDE!