Ceph is a mature platform for software-defined storage environments scaling to dozens or hundreds of petabytes. However, real-life implementation, operations, and maintenance are complex tasks. Ensuring compatibility between software and hardware, avoiding bottlenecks between storage nodes, keeping the complete stack running, replacing end-of-life hardware, and operating the whole stack efficiently can create substantial effort and risk for IT. Experimenting with petabyte-scale storage is not an option, and reliable service levels are key. Fujitsu presents a straightforward way to move from “Build your own disaster” to an enterprise-class service level for Ceph-based storage.
Oh no! You have a bug in your app, but you have no idea where it is. I’ll walk you through how we found and squashed a gnarly bug in socket.io using Wireshark, Chrome’s developer tools, lots of logging, and pretty graphs. I’ll also share some tips and tricks for tracking down and squashing bugs of your own.
Ben Golub and Solomon Hykes give a thank-you speech marking Docker’s first year, at Docker HQ.
Don’t tell your boss, but I want you to make a useless art project, because it’s actually pretty useful. Why? Committing to uselessness is a freeing experiment. As professionals, we tend to focus on the end result instead of the process, and that’s not healthy. Embrace the creative process (iteration and experimentation) on a project and see where the path takes you.
“Inspiration is for amateurs. The rest of us just show up and get to work” - Chuck Close
In the decade between 1999 and 2008, more newly-approved, first-in-class drugs were found by phenotypic screens than by molecular target-based approaches. This is despite far more resources being invested in the latter, and highlights the rising importance of screens in biomedical research. (Swinney and Anthony, Nat Rev Drug Discov, 2011)
Despite this success, the data from phenotypic screens is vastly underutilized. A typical analysis takes millions of images, obtained at a cost of, say, $250,000, and reduces each to a single number, a quantification of the phenotype of interest. The images are then ranked by that value and the top-ranked images are flagged for further investigation. (Zanella et al, Trends Biotech, 2010)
The images, however, contain a lot more information than just a single phenotypic number. For one, usually only the mean phenotype of all the cells in the image is reported, with no information about variability, even though the distribution of cell shapes in a single image is highly informative (Yin et al, Nat Cell Biol, 2013). Additionally, cells display a variety of off-target phenotypes, independently of the target, that can provide biological insight and new research avenues.
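To make the point about variability concrete, here is a minimal Python sketch using only the standard library. The image names and per-cell scores are invented for illustration and are not data from any screen; the `summarize` helper is likewise hypothetical.

```python
from statistics import mean, stdev

# Hypothetical per-cell shape scores for two images (illustrative values only).
cells_per_image = {
    "image_a": [0.48, 0.50, 0.52, 0.49, 0.51],  # homogeneous population
    "image_b": [0.10, 0.90, 0.12, 0.88, 0.50],  # two distinct subpopulations
}


def summarize(scores):
    """Return the mean and sample standard deviation of per-cell scores."""
    return {"mean": mean(scores), "stdev": stdev(scores)}


summaries = {name: summarize(scores) for name, scores in cells_per_image.items()}

for name, summary in summaries.items():
    print(f"{name}: mean={summary['mean']:.2f} stdev={summary['stdev']:.2f}")
```

Both images share the same mean score, so a ranking on the mean alone cannot tell them apart, while the spread of per-cell values separates them clearly.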
We are developing an unsupervised clustering pipeline, tentatively named high-content-screen unsupervised sample clustering (HUSC), that leverages the scientific Python stack, particularly scipy.stats, pandas, scikit-image, and scikit-learn, to summarize images with feature vectors, cluster them, and infer the functions of genes corresponding to each cluster. The library includes functions for preprocessing images, computing an array of features designed specifically for microscopy images, and accessing a MongoDB database containing sample data. Its API allows easy extensibility by placing screen-specific functions under the screens sub-package. An example IPython notebook with a preliminary analysis can be found here.
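The core idea of clustering images by their feature vectors can be sketched in a few lines of Python. This is not the HUSC API: the feature values, image names, and the `nearest` and `kmeans` helpers are hypothetical, and k-means is chosen here only as a familiar stand-in, since the abstract does not say which clustering method the pipeline uses. The sketch uses the standard library to stay self-contained; the abstract itself names scikit-learn for this role.

```python
import math

# Hypothetical feature vectors, one per image (illustrative values only).
features = {
    "img_1": [0.10, 0.20],
    "img_2": [0.15, 0.22],
    "img_3": [0.90, 0.80],
    "img_4": [0.88, 0.85],
    "img_5": [0.12, 0.18],
}


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, centroids, n_iter=10):
    """Plain k-means from given starting centroids; returns (labels, centroids)."""
    labels = [nearest(p, centroids) for p in points]
    for _ in range(n_iter):
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                # New centroid is the per-feature mean of the cluster members.
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        new_labels = [nearest(p, new_centroids) for p in points]
        centroids = new_centroids
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


names = list(features)
points = [features[n] for n in names]
# Start from the first and third images so the example is deterministic.
labels, centroids = kmeans(points, [points[0], points[2]])
clusters = dict(zip(names, labels))
print(clusters)
```

Images that land in the same cluster would then be candidates for sharing a phenotype, which is the step where gene function could be inferred per cluster.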
We plan to use this library to develop a web interface for flexible and extensible analysis of high-content screens, and we relish the opportunity to enlist the help and expertise of the SciPy crowd.