The Pittsburgh Supercomputing Center Presents Sherlock, a YarcData uRiKA System for Unlocking the Secrets of Big Data

Wed Nov 7, 2012 7:00am EST

* Reuters is not responsible for the content in this press release.

  PITTSBURGH, PA and PLEASANTON, CA, Nov 07 (Marketwire)
The Pittsburgh Supercomputing Center (PSC) and YarcData, a Cray (NASDAQ:
CRAY) company, today announced the deployment of "Sherlock," a uRiKA
graph-analytics appliance from YarcData for efficiently discovering
unknown relationships or patterns "hidden" in extremely large and complex
bodies of information. Funded through the Strategic Technologies for
Cyberinfrastructure (STCI) program of the National Science Foundation,
Sherlock features innovative hardware and software, as well as
PSC-specific enhancements, designed to extend the range of applicability
to scales not otherwise feasible.

    Graph-analytics techniques have long been used by the government and are
coming into wider commercial use. Sherlock will focus on extending their
applicability to a wide range of scientific research.

    "Sherlock," says Nick Nystrom, PSC director of strategic applications,
"provides a unique capability for discovering new patterns and
relationships in data. It will help to discover how genes work, probe the
dynamics of social networks, and detect the sources of breaches in
Internet security." Those diverse challenges, along with many others, he
adds, have two important features in common: Their data are naturally
expressed as interconnected webs of information called graphs, and data
sizes for problems of real-world interest become extremely large. 

    "Until now, graph analytics has largely been impractical for big data,"
says Nystrom. This is because, he explains, processing of graph
structures requires irregular and unpredictable access to data. On
ordinary computers and clusters, nearly all the time is spent waiting for
that data to move from memory to processors. Even more challenging,
graphs of interest typically cannot be partitioned; their high
connectivity prevents dividing them into subgraphs that can be mapped
independently onto distributed-memory computers. These factors have
precluded large-scale graph analytics, especially for the interactive
response times that analysts need to explore data. "YarcData's uRiKA,"
says Nystrom, "overcomes that barrier through groundbreaking innovations
in computer hardware and software."
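
    The partitioning problem Nystrom describes can be illustrated with a
small sketch. The Python below is not from PSC or YarcData; the two toy
graphs, the contiguous-block partitioning scheme, and the helper name
count_cut_edges are all assumptions chosen for illustration. It counts how
many edges would have to cross between machines if the vertices were split
across four distributed-memory nodes.

```python
from itertools import combinations


def count_cut_edges(edges, num_vertices, num_parts):
    """Count edges whose two endpoints fall in different partitions.

    Vertices 0..num_vertices-1 are split into num_parts contiguous blocks.
    This is an illustrative scheme only, not what any real system uses.
    """
    block = num_vertices // num_parts

    def part(v):
        # Clamp so leftover vertices land in the last partition.
        return min(v // block, num_parts - 1)

    return sum(1 for u, v in edges if part(u) != part(v))


N, PARTS = 16, 4

# Weakly connected graph: a simple chain 0-1-2-...-15.
chain = [(i, i + 1) for i in range(N - 1)]

# Highly connected graph: every vertex linked to every other vertex.
dense = list(combinations(range(N), 2))

for name, edges in (("chain", chain), ("dense", dense)):
    cut = count_cut_edges(edges, N, PARTS)
    print(f"{name}: {cut} of {len(edges)} edges cross a partition boundary")
```

    In this toy case the chain cuts only 3 of its 15 edges, while the fully
connected graph cuts 96 of 120. Every cut edge implies communication between
nodes, which suggests why highly connected graphs map poorly onto
distributed-memory machines.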

    Sherlock enables large-scale, rapid graph analytics through massive
multithreading, a shared address space, sophisticated memory
optimizations, a productive user environment, and support for
heterogeneous applications -- all packaged as an enterprise-ready
appliance. "Sherlock provides researchers with a uniquely powerful tool
for doing complex analytics on big data, expanding the capability to
address problems of societal importance," says Nystrom. 

    "Many current approaches to big data have been about 'search' -- the
ability to efficiently find something that you know is there in your
data," said Arvind Parthasarathi, President of YarcData. "uRiKA was
purposely built to solve the problem of 'discovery' in big data -- to
discover things, relationships or patterns that you don't know exist. By
giving organizations the ability to do much faster hypothesis validation
at scale and in real time, we are enabling the solution of business
problems that were previously difficult or impossible -- whether it be
discovering the ideal patient treatment, investigating fraud, detecting
threats, finding new trading algorithms or identifying counter-party
risk. Basically, we are systematizing serendipity."

    The project complements ongoing leadership in data-intensive computing at
Carnegie Mellon University (CMU). Randal E. Bryant, Dean of the School of
Computer Science at CMU, notes, "We're very pleased that the PSC will
have this new capability for analyzing large-scale, unstructured graphs.
Such data structures pervade many of the big data applications being
investigated by researchers in such diverse areas as biology (e.g., the
connectivity between molecules in a protein), networks (e.g., the
structure of the world-wide web), and artificial intelligence (e.g., the
relationships between different concepts). The uRiKA system will enable
scientists to deal with far more complex graphs than would otherwise be

    YarcData's uRiKA is a Big Data appliance for graph analytics that enables
enterprises to discover unknown relationships in Big Data. uRiKA is a
highly scalable, real-time platform that supports ad hoc queries,
pattern-based searches, inferencing and deduction. uRiKA is a purpose-built
appliance for graph analytics featuring graph-optimized hardware that
provides up to 512 terabytes of global shared memory,
massively multithreaded graph processors supporting 128
threads/processor, and an RDF/SPARQL database optimized for the
underlying hardware, enabling applications to interact with the appliance
using industry-standard interfaces. Singularly focused on graph
analytics, uRiKA augments existing analytical environments by delivering
new high-value discoveries and insights that drive competitive advantage.
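
    For readers unfamiliar with RDF, the idea behind a pattern-based query
can be sketched in a few lines of plain Python. This is a conceptual
illustration only: it does not use uRiKA, SPARQL, or any RDF library, and
the data, the predicate names, and the helpers match and two_hop are
hypothetical. Facts are stored as (subject, predicate, object) triples, and
a query is a pattern in which None acts as a wildcard.

```python
def match(triples, pattern):
    """Return the triples that fit a (subject, predicate, object) pattern.

    A None in the pattern matches any value in that position.
    """
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


def two_hop(triples, predicate):
    """Find (x, z) pairs linked by two consecutive edges with one predicate.

    Roughly the shape of a graph query such as:
        ?x predicate ?y . ?y predicate ?z
    """
    pairs = []
    for s, _, mid in match(triples, (None, predicate, None)):
        for _, _, o in match(triples, (mid, predicate, None)):
            pairs.append((s, o))
    return pairs


# Hypothetical toy data, invented for this example.
triples = [
    ("geneA", "regulates", "geneB"),
    ("geneB", "regulates", "geneC"),
    ("geneA", "expressed_in", "liver"),
]

print(match(triples, ("geneA", None, None)))
print(two_hop(triples, "regulates"))
```

    The second query returns an indirect relationship (geneA to geneC) that
is not stated in any single triple, which is the kind of "discovery" query
the release describes, here at toy scale.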

    PSC customized Sherlock with additional nodes having standard x86
processors to add valuable support for heterogeneous applications that
use YarcData's Threadstorm nodes as graph accelerators. This
heterogeneous capability will enable an even broader class of
applications, such as genomics, astrophysics, and structural analyses of
complex networks. Sherlock runs an enhanced suite of familiar semantic
web software for easy access to powerful analytic functionality, together
with common programming languages. PSC's Data Supercell provides
complementary, high-performance access to large datasets for ongoing,
collaborative analysis.

    Prototype projects, led by researchers from across the country, will use
Sherlock for research including understanding the natural language of the
Web, learning about human social networks involving different types of
online and telephone interactions, cluster finding in astrophysics, and
genome sequence assembly. For example, Bin Zhang, of the Fox School of
Business at Temple University, notes the potential for Sherlock to expand
his research into clustering in social networks: "With the help of
Sherlock, I can finally observe the true size of social groups in
real-world networks of millions to even a billion people. Researchers
believe that social group size is larger for online social networks than
for traditional groups, but so far it has been impossible to extract
groups from large networks and visualize their structures. Sherlock can
finally enable us to observe the structure of large social groups and
even the whole network." Additional projects will be introduced over
time; more information is available at

    About PSC:
 The Pittsburgh Supercomputing Center is a
joint effort of Carnegie Mellon University and the University of
Pittsburgh together with Westinghouse Electric Company. Established in
1986, PSC is supported by several federal agencies, the Commonwealth of
Pennsylvania and private industry, and is a partner in the National
Science Foundation XSEDE program.

    About YarcData 
 YarcData, a Cray company, delivers business-focused
real-time graph analytics for enterprises to gain business insight by
discovering unknown relationships in Big Data. Early adopters include the
Canadian government, Institute for Systems Biology, Mayo Clinic, Noblis,
Sandia National Laboratories, and the United States government. YarcData
is based in the San Francisco Bay Area and more information is at

    About Cray Inc.
 As a global leader in supercomputing, Cray provides
highly advanced supercomputers and world-class services and support to
government, industry and academia. Cray technology is designed to enable
scientists and engineers to achieve remarkable breakthroughs by
accelerating performance, improving efficiency and extending the
capabilities of their most demanding applications. Cray's Adaptive
Supercomputing vision is focused on delivering innovative next-generation
products that integrate diverse processing technologies into a unified
architecture, allowing customers to surpass today's limitations and
meeting the market's continued demand for realized performance. Go to for more information.


Shandra Williams
Pittsburgh Supercomputing Center

Nick Davis
YarcData Media

Copyright 2012, Marketwire, All rights reserved.