
Data Science Work That Matters


The Science of Technologies for End-to-End Enablement of Data (STEED) is leading national and international efforts to unlock the value of big data by providing the tools, science, and talent for next-generation technologies and infrastructure. STEED provides technologies and tools that bridge the time gap between the acquisition of data and real-time and long-term strategic and tactical decision making. It does so by enabling new application domains through innovative methods for structuring and transforming data.

Recently, STEED has been working on a project called Storage Convergence for Map-Reduce. The map-reduce programming model, best typified by Apache Hadoop, is becoming the de facto parallel data processing paradigm because of its ability to scale and tolerate failures. However, data in Hadoop can only be accessed from within a map-reduce program executing in the cluster. This isolation is problematic because it prevents other programs from using map-reduce results immediately, and it removes the data from normal life cycle operations such as backup and snapshot. STEED is working to enable storage convergence between Hadoop clusters and enterprise-level storage systems, giving map-reduce applications efficient in-situ access to data on enterprise storage.



  • The PyData Carolinas conference is coming to RTP in September. For more information, see DSI News.
  • The Workshop on Distributed and Parallel Data Analysis is coming to SAMSI in RTP in September. For more information, see DSI News.


Upcoming Events

Sep 2
Sep 14
  • Sep

    Time: 12:00 am - 11:59 pm

    Location: IBM RTP Activity Center, 3039 East Cornwallis Road, Building 400, Research Triangle, NC 27709
