Brainstorming on the brochure
As per Michael's request, we have started brainstorming on what
products/services we are offering to make a BIOwulf marketing brochure.
This is a summary of a discussion with Matt Carbone and Michael Stern.
Questions to be answered before getting started with the brochure:
1 - who are our customers? (and who is going to read the brochure?)
2 - what problems can we solve for them?
3 - how can we solve them?
4 - what is the revenue model?
Tentative answers:
1 - Our customers are big pharmas, agricultural companies, and startups
building bio-medical instrumentation.
The audience of the brochure will be high-level decision makers with
some scientific background.
2 - We use Machine Learning to solve a number of problems for which
there is incomplete knowledge but data is available in the form of
examples.
The examples can be images, spectra, or genotypes, each associated with
a teaching signal, e.g. which example is normal and which one reveals a
malignancy.
After learning the examples, the Learning Machine can make predictions
on new unseen examples. A typical example is medical diagnosis.
Typically, the examples have a large number of input components (pixels
in an image, peaks in a spectrum, expression coefficients in a DNA
microarray).
We weed out useless components and reveal relevant ones. We exhibit
subsets of complementary components that provide the best predictive
power. This can be used, for instance, to design an economical
diagnostic test that requires very few measurements.
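To make the idea of weeding out components concrete, here is a minimal
sketch in Python. It is an illustration only, not our actual method: it
swaps in a simple correlation filter (ranking each component by how
strongly it correlates with the teaching signal) in place of SVM-based
selection, and it scores components one at a time, so it does not capture
complementary subsets. The function names and the toy data are
hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_components(examples, labels):
    """Return component indices, most correlated with the labels first."""
    n_components = len(examples[0])
    scores = []
    for j in range(n_components):
        column = [row[j] for row in examples]
        scores.append((abs(pearson(column, labels)), j))
    # Sort by decreasing score; break ties by component index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores]


def select_top(examples, labels, k):
    """Keep the k best-ranked components (the 'few measurements' of a cheap test)."""
    return sorted(rank_components(examples, labels)[:k])


# Hypothetical toy data: 6 examples, 3 components each.
# Component 0 separates the classes, component 1 is constant,
# component 2 is mostly noise.
examples = [
    [0.9, 5.0, 0.1],
    [1.1, 5.0, 0.9],
    [1.0, 5.0, 0.5],
    [3.0, 5.0, 0.2],
    [3.2, 5.0, 0.8],
    [2.9, 5.0, 0.4],
]
labels = [0, 0, 0, 1, 1, 1]  # teaching signal: 0 = normal, 1 = malignant

ranking = rank_components(examples, labels)
selected = select_top(examples, labels, 1)
```

On this toy data, the single retained component is the one that separates
the two classes, and the constant component ranks last. A real
application would validate the selection on held-out examples.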
Our methods squeeze more information out of the data than others: they
extract information from components that are usually discarded as noise.
We can help big pharmas analyze their data and determine whether there
exist components in the data with significant predictive power (e.g.
interesting genes that explain a given phenotype, or that contribute
most to a given phenotype but not to another one) and find singular
examples of particular interest (e.g. borderline patients). We can help
them merge information from various sources, theirs plus ours (gene,
protein, textual data from medical abstracts) to refine the selection of
the most relevant components. Our visualization tools do not just
present results: they allow browsing through alternative solutions and
going back to the original data to understand the significance of the
patterns discovered.
We can help medical instrumentation companies get more out of their
sensors. We can co-develop with them general purpose software that will
make their sensors more competitive by increasing the predictive power
of the signal. We can custom design solutions to particular problems
(e.g. the design of a diagnostic test for a particular disease) by
selecting the optimum conditions of utilization and training a predictor
to obtain optimum performance. We can provide them with custom made
tools to aid the design of such solutions for new problems
(incorporating experiment design and the choice of optimum conditions).
Further down the road (maybe never):
- We can offer regular (paying) schools to teach people about SVMs.
- We can provide consulting services.
- We can provide a library of tools available on-line by subscription.
People could also purchase compiled code for embedding in
applications.
- We can sell the drug target discovery platform.
3 - We team up tightly with the customers to create
dedicated/customized/tailored solutions using SVMs and other analytic
tools: one mathematician is paired with one domain knowledge expert.
We have a toolkit for internal use from which we design dedicated
solutions. We interact with customers during development via the
numerical lab (upload data to our server, download results).
We analyze the data ourselves and provide results. We also let our
partners do their own explorations. This allows us to refine our tools
for their particular needs.
We share the benefits of the results and let partners embed software
modules in their products.
Our partners benefit from our ML expertise and our exposure to a variety
of problems through multiple partnerships.
4 - We want to share the benefits of the use of our tools and
discoveries made with our tools:
- the discoveries made by our analysts on data provided during
development (royalties can be charged on co-discoveries used in a
product, e.g. a drug)
- the discoveries made by our partners via the numerical lab (same)
- the use of the numerical lab (easy to monitor, could be a
subscription)
- the purchase or use of our tools made by the customers of our partners
(royalties on the sales or fee for use)
We want to retain ML IP and software copyrights to analyse our own data
and "resell" it to other customers.
We can have a client-server approach where we give away a client with
little or no capability that is empowered by our server.
If the client is bundled with the medical instruments, it is like
Microsoft Explorer and MSN, or Netscape and AOL. We can have free
trials and get people hooked. The server model opens the door to many
easy billing mechanisms. The challenge is data privacy.