Evernote has over 37 million users who store billions of notes. One of its popular apps is Hello where users scan business card info and store photos as an extension of their own memory. All this leads to a massive data store.
As they explain on their blog
“The online application has generated a continuous stream of structured logs, which record high-level activity performed via our APIs from clients and web interfaces. These logs have grown with time and usage so that now we’re recording nearly 200 million events per day, with more than 66 billion events since our launch in 2008.”
“We decided to split the analytics environment into three tiers: a ten-node Hadoop cluster for warehousing and cleaning the primary data, a three-node ParAccel columnar database for queries over sets of derived tables, and a JasperReports server for generating periodic reports full of pretty graphs and tables.”
Comments