Brainstorming on the brochure

As per Michael's request, we have started brainstorming on which products and services we offer, in order to prepare a BIOwulf marketing brochure. This is a summary of a discussion with Matt Carbone and Michael Stern.
Questions to be answered before getting started with the brochure:
1 - who are our customers? (and who is going to read the brochure?)
2 - what problems can we solve for them?
3 - how can we solve them?
4 - what is the revenue model?

Tentative answers:
1 - Our customers are big pharmaceutical and agricultural companies, and startups building bio-medical instrumentation. The audience of the brochure will be high-level decision makers with some scientific background.

2 - We use Machine Learning to solve a number of problems for which there is incomplete knowledge but data is available in the form of examples. The examples can be images, spectra, or genotypes, associated with a teaching signal, e.g. which example is normal and which one reveals a malignancy. After learning from the examples, the Learning Machine can make predictions on new, unseen examples. A typical application is medical diagnosis. Typically, the examples have a large number of input components (pixels in an image, peaks in a spectrum, expression coefficients in a DNA microarray). We weed out useless components and reveal relevant ones. We exhibit subsets of complementary components that provide the best predictive power. This can be used, for instance, to design an economical diagnostic test requiring very few measurements. Our methods squeeze more information out of the data than others: they extract information from components that are usually discarded as noise.
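To make the idea of weeding out useless components concrete, here is a minimal sketch in Python. It is not our SVM-based method: it swaps in a simple per-component signal-to-noise score, and the toy data, the function names (`rank_components`, `make_toy_data`) and the choice of components 3 and 17 as the informative ones are all assumptions made for illustration. Because it scores each component on its own, it also does not capture the complementarity between components mentioned above.

```python
import random
import statistics


def rank_components(examples, labels):
    """Rank input components by a simple signal-to-noise score.

    Score of component j = |mean(class 1) - mean(class 0)| / (std(class 1) + std(class 0)).
    Returns component indices, most discriminative first.
    Illustrative stand-in only; not an SVM-based selection.
    """
    n_components = len(examples[0])
    pos = [x for x, y in zip(examples, labels) if y == 1]
    neg = [x for x, y in zip(examples, labels) if y == 0]
    scores = []
    for j in range(n_components):
        p = [x[j] for x in pos]
        n = [x[j] for x in neg]
        spread = statistics.pstdev(p) + statistics.pstdev(n)
        # Small constant avoids division by zero for constant components.
        scores.append(abs(statistics.fmean(p) - statistics.fmean(n)) / (spread + 1e-12))
    return sorted(range(n_components), key=lambda j: scores[j], reverse=True)


def make_toy_data(n_examples=200, n_components=50, informative=(3, 17), seed=0):
    """Hypothetical data: pure noise, except a few components shifted for class 1."""
    rng = random.Random(seed)
    examples, labels = [], []
    for _ in range(n_examples):
        y = rng.randint(0, 1)  # teaching signal: 0 = normal, 1 = abnormal
        x = [rng.gauss(0.0, 1.0) for _ in range(n_components)]
        for j in informative:
            x[j] += 3.0 * y  # only these components carry information
        examples.append(x)
        labels.append(y)
    return examples, labels


examples, labels = make_toy_data()
ranking = rank_components(examples, labels)
print("Top two components:", sorted(ranking[:2]))
```

In this toy setting, a test built on the top-ranked components would need only two measurements out of fifty, which is the kind of economy described above.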

We can help big pharmaceutical companies analyze their data and determine whether there exist components in the data with significant predictive power (e.g. interesting genes that explain a given phenotype, or that contribute most to a given phenotype but not to another one) and find singular examples of particular interest (e.g. borderline patients). We can help them merge information from various sources, theirs plus ours (gene, protein, textual data from medical abstracts), to refine the selection of the most relevant components. Our visualization tools do not just present results; they allow browsing through alternative solutions and going back to the original data to understand the significance of the patterns discovered.

We can help medical instrumentation companies get more out of their sensors. We can co-develop with them general-purpose software that will make their sensors more competitive by increasing the predictive power of the signal. We can custom-design solutions to particular problems (e.g. the design of a diagnostic test for a particular disease) by selecting the optimum conditions of utilization and training a predictor to obtain optimum performance. We can provide them with custom-made tools to aid the design of such solutions for new problems (incorporating experiment design and the choice of optimum conditions).

Further down the road (maybe never):
- We can offer regular (paying) schools to teach people about SVMs.
- We can provide consulting services.
- We can provide a library of tools available on-line by subscription.
People could also purchase compiled code for embedding in applications.
- We can sell the drug target discovery platform.

3 - We team up tightly with the customers to create dedicated/customized/tailored solutions using SVMs and other analytic tools: one mathematician teamed up with one domain expert. We have a toolkit for internal use from which we design dedicated solutions. We interact with customers during development via the numerical lab (upload data to our server, download results). We analyze the data ourselves and provide results. We also let our partners do their own explorations. This allows us to refine our tools for their particular needs. We share the benefits of the results and let partners embed software modules in their products. Our partners benefit from our ML expertise and our exposure to a variety of problems through multiple partnerships.

4 - We want to share the benefits of the use of our tools and of the discoveries made with our tools:
- the discoveries made by our analysts on data provided during development (royalties can be charged on co-discoveries used in a product, e.g. a drug)
- the discoveries made by our partners via the numerical lab (same)
- the use of the numerical lab (easy to monitor, could be a subscription)
- the purchase or use of our tools by the customers of our partners (royalties on the sales or fee for use)
We want to retain ML IP and software copyrights so that we can analyse our own data and "resell" it to other customers. We can have a client-server approach where we give away a client with little or no capability that is empowered by our server. If the client is bundled with the medical instruments, it is like Microsoft Explorer and MSN, or Netscape and AOL. We can have free trials and get people hooked. The server model opens the door to many easy billing mechanisms. The challenge is data privacy.