<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN" "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8"/>
    <title>Project-Team:ZENITH</title>
    <link rel="stylesheet" href="../static/css/raweb.css" type="text/css"/>
    <meta name="description" content="Overall Objectives - Overall Objectives"/>
    <meta name="dc.title" content="Overall Objectives - Overall Objectives"/>
    <meta name="dc.subject" content=""/>
    <meta name="dc.publisher" content="INRIA"/>
    <meta name="dc.date" content="(SCHEME=ISO8601) 2015-01"/>
    <meta name="dc.type" content="Report"/>
    <meta name="dc.language" content="(SCHEME=ISO639-1) en"/>
    <meta name="projet" content="ZENITH"/>
  </head>
  <body>
    <div class="tdmdiv">
      <div class="logo">
        <a href="http://www.inria.fr">
          <img style="align:bottom; border:none" src="../static/img/icons/logo_INRIA-coul.jpg" alt="Inria"/>
        </a>
      </div>
      <div class="TdmEntry">
        <div class="tdmentete">
          <a href="uid0.html">Project-Team Zenith</a>
        </div>
        <span>
          <a href="uid1.html">Members</a>
        </span>
      </div>
      <div class="tdmActPage">
        <a href="./uid3.html">Overall Objectives</a>
      </div>
      <div class="TdmEntry">Research Program<ul><li><a href="uid5.html">Data Management</a></li><li><a href="uid6.html">Distributed Data Management</a></li><li><a href="uid7.html">Cloud Data Management</a></li><li><a href="uid13.html">Big Data</a></li><li><a href="uid17.html">Uncertain Data Management</a></li><li><a href="uid18.html">Big data Integration</a></li><li><a href="uid19.html">Data Mining</a></li><li><a href="uid23.html">Content-based Information Retrieval</a></li></ul></div>
      <div class="TdmEntry">Application Domains<ul><li><a href="uid28.html">Data-intensive Scientific Applications</a></li></ul></div>
      <div class="TdmEntry">
        <a href="./uid34.html">Highlights of the Year</a>
      </div>
      <div class="TdmEntry">New Software and Platforms<ul><li><a href="uid36.html">Hadoop_g5k</a></li><li><a href="uid37.html">LogMagnet</a></li><li><a href="uid38.html">MultiSite-Rec</a></li><li><a href="uid39.html">ThePlantGame: crowdsourced plants identification</a></li><li><a href="uid40.html">Pl@ntNet</a></li><li><a href="uid41.html">Snoop &amp; SnoopIm</a></li><li><a href="uid42.html">SciFloware</a></li><li><a href="uid43.html">CloudMdsQL Compiler</a></li><li><a href="uid44.html">Chiaroscuro</a></li><li><a href="uid45.html">FP-Hadoop</a></li></ul></div>
      <div class="TdmEntry">New Results<ul><li><a href="uid47.html">Big Data Integration</a></li><li><a href="uid51.html">Distributed Indexing and Searching</a></li><li><a href="uid53.html">Scientific Workflows</a></li><li><a href="uid58.html">Scalable Query Processing</a></li><li><a href="uid60.html">Data Stream Mining</a></li><li><a href="uid62.html">Scalable Data Analysis</a></li></ul></div>
      <div class="TdmEntry">Bilateral Contracts and Grants with Industry<ul><li><a href="uid72.html">Microsoft (2013-2017)</a></li><li><a href="uid73.html">Triton I-lab (2014-2016)</a></li></ul></div>
      <div class="TdmEntry">Partnerships and Cooperations<ul><li><a href="uid75.html">Regional Initiatives</a></li><li><a href="uid78.html">National Initiatives</a></li><li><a href="uid86.html">European Initiatives</a></li><li><a href="uid90.html">International Initiatives</a></li><li><a href="uid107.html">International Research Visitors</a></li></ul></div>
      <div class="TdmEntry">Dissemination<ul><li><a href="uid111.html">Scientific Animation</a></li><li><a href="uid175.html">Teaching - Supervision - Juries</a></li><li><a href="uid201.html">Popularization</a></li></ul></div>
      <div class="TdmEntry">
        <div>Bibliography</div>
      </div>
      <div class="TdmEntry">
        <ul>
          <li>
            <a id="tdmbibentmajor" href="bibliography.html">Major publications</a>
          </li>
          <li>
            <a id="tdmbibentyear" href="bibliography.html#year">Publications of the year</a>
          </li>
        </ul>
      </div>
    </div>
    <div id="main">
      <div class="mainentete">
        <div id="head_agauche">
          <small><a href="http://www.inria.fr">Inria</a> | <a href="../index.html">Raweb 2015</a> | <a href="http://www.inria.fr/en/teams/zenith">Presentation of the Project-Team ZENITH</a> | <a href="http://www-sop.inria.fr/teams/zenith/">ZENITH Web Site</a></small>
        </div>
        <div id="head_adroite">
          <table class="qrcode">
            <tr>
              <td>
                <a href="zenith.xml">
                  <img style="align:bottom; border:none" alt="XML" src="../static/img/icons/xml_motif.png"/>
                </a>
              </td>
              <td>
                <a href="zenith.pdf">
                  <img style="align:bottom; border:none" alt="PDF" src="IMG/qrcode-zenith-pdf.png"/>
                </a>
              </td>
              <td>
                <a href="../zenith/zenith.epub">
                  <img style="align:bottom; border:none" alt="e-pub" src="IMG/qrcode-zenith-epub.png"/>
                </a>
              </td>
            </tr>
            <tr>
              <td/>
              <td>PDF
</td>
              <td>e-Pub
</td>
            </tr>
          </table>
        </div>
      </div>
      <!--FIN du corps du module-->
      <br/>
      <div class="bottomNavigation">
        <div class="tail_aucentre">
          <a href="./uid1.html" accesskey="P"><img style="align:bottom; border:none" alt="previous" src="../static/img/icons/previous_motif.jpg"/> Previous | </a>
          <a href="./uid0.html" accesskey="U"><img style="align:bottom; border:none" alt="up" src="../static/img/icons/up_motif.jpg"/>  Home</a>
          <a href="./uid5.html" accesskey="N"> | Next <img style="align:bottom; border:none" alt="next" src="../static/img/icons/next_motif.jpg"/></a>
        </div>
        <br/>
      </div>
      <div id="textepage">
        <!--DEBUT2 du corps du module-->
        <h2>Section: Overall Objectives</h2>
        <h3 class="titre3">Overall Objectives</h3>
        <p>Modern sciences such as agronomy, bioinformatics, astronomy and
environmental science must deal with overwhelming amounts of
experimental data produced through empirical observation and
simulation. Such data
must be processed (cleaned, transformed, analyzed) in all kinds of
ways in order to draw new conclusions, prove scientific theories and
produce knowledge. However, constant progress in scientific
observational instruments (e.g. satellites, sensors, the Large Hadron
Collider) and simulation tools (which foster in silico experimentation,
as opposed to traditional in situ or in vivo experimentation) creates
a huge data overload. For example, climate modeling data are growing
so fast that collections of hundreds of exabytes
(1 exabyte = <span class="math"><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mn>10</mn><mn>18</mn></msup></math></span> bytes) are expected by 2020.</p>
        <p>Scientific data are also very complex, in particular because of
the heterogeneous methods used for producing data, the uncertainty of
captured data, the inherently multi-scale nature (spatial scale,
temporal scale) of many sciences and the growing use of imaging
(e.g. satellite images), resulting in data with hundreds of
attributes, dimensions or descriptors. Processing and analyzing such
massive sets of complex scientific data is therefore a major challenge,
since solutions must combine new data management techniques with
large-scale parallelism in cluster, grid or cloud environments.</p>
        <p>Furthermore, modern scientific research is a highly collaborative
process, involving scientists from different disciplines
(e.g. biologists, soil scientists, and geologists working on an
environmental project), in some cases from different organizations
distributed over different countries. Each discipline or
organization tends to produce and manage its own data, in specific
formats, with its own processes. Thus, integrating distributed data and
processes becomes more difficult as the amounts of heterogeneous data grow.</p>
        <p>Despite this variety, scientific data share common features.
They are big; manipulated through complex, distributed
workflows; typically complex, e.g. multidimensional or graph-based;
uncertain in their values, e.g. because of imprecise data capture or
observation; accompanied by important metadata about experiments and their
provenance; and mostly append-only (with rare updates).</p>
        <p>Generic data management solutions (e.g. relational DBMSs), which have
proved effective in many application domains (e.g. business
transactions), are not efficient for dealing with scientific data,
thereby forcing scientists to build ad-hoc solutions which are
labor-intensive and cannot scale. In particular, relational DBMSs have
lately been criticized for their “one size fits all”
approach. Although they have been able to integrate support for all
kinds
of data (e.g., multimedia objects, XML documents) and new functions,
this has resulted in a loss of performance and flexibility for
applications with specific requirements because they provide
both “too much” and “too little”. Therefore, it has been argued
that more
specialized DBMS engines are needed. For instance, column-oriented
DBMSs, which store the values of each column together rather than storing
whole rows as traditional row-oriented relational DBMSs do, have been shown to perform
more than an order of magnitude better on decision-support workloads.
The “one size does not fit all” counter-argument generally applies
to cloud
data management as well. Cloud data can be very large, unstructured
(e.g. text-based)
or semi-structured, and typically append-only (with rare updates).
Although cloud users and application developers may be numerous, they are typically not DBMS experts.
Therefore, current cloud data management solutions have traded
consistency for scalability,
simplicity and flexibility. As an alternative to relational DBMSs (which use the standard SQL language), these solutions have been termed Not Only SQL (NoSQL) by the database research community.</p>
        <p>The three main challenges of scientific data management can be summarized as: (1) scale (big data, big applications); (2) complexity (uncertain, multi-scale data with many dimensions); and (3) heterogeneity (in particular, heterogeneity of data semantics).
The overall goal of Zenith is to address these challenges by proposing innovative solutions with significant advantages in terms of scalability, functionality, ease of use, and performance. To produce generic results, these solutions take the form of architectures, models and algorithms that can be implemented as components or services in specific computing environments, e.g. grid or cloud. To maximize impact, a good balance between conceptual aspects (e.g. algorithms) and practical aspects (e.g. software development) is necessary. We design and validate our solutions by working closely with scientific application partners (CIRAD, INRA, IRD, etc.). To further validate our solutions and extend the scope of our results, we also want to foster industrial collaborations, even in non-scientific applications, provided that they exhibit similar challenges.</p>
      </div>
      <!--FIN du corps du module-->
    </div>
  </body>
</html>
