Category Archives: Relational

comSysto Heads to the Capital! Why? MongoDB Days in Berlin!

On 26 February in Berlin, we will once again bring to life the partnership between 10gen, the company behind MongoDB, and comSysto. MongoDB Berlin 2013 is coming up, and comSysto is once again a major contributor to this must-attend event, both in people and, above all, in content.

MongoDB Berlin is an annual user conference on the schema-free, document-oriented open-source database MongoDB. A year and a half ago, comSysto became the first company in Germany to enter into a strategic partnership with 10gen. Since then we have sponsored the MongoDB conferences several times, we host the Munich MongoDB User Group, and we are a highly competent, successful point of contact for companies in Germany. In short, comSysto is right in the thick of it rather than just along for the ride!

The same goes for the MongoDB Berlin 2013 conference on 26 February. Two of our colleagues, Cindy Lamm and Bernd Zuther, will give a talk about Mario. Mario is a well-known Munich pizza baker and the owner of a MongoDB-based online shop who wants to increase his cross-selling and upselling rate. To achieve that, Mario has hired a statistician (our Cindy) to build a recommendation engine into his online shop. The talk covers how to build an online recommendation engine on top of MongoDB and Apache Mahout. Cindy and Bernd will also show which recommenders help Mario reach his goal of a higher cross-selling and upselling rate, and how they can be integrated into his existing infrastructure. Finally, a real-time dashboard will demonstrate the efficiency of the recommendation engine.

As a partner of 10gen and host of the Munich MongoDB User Group, we can offer you discounted tickets. Click here for the promotion code!

The comSysto talk starts at 16:15 in Room 3. This and all other talks are listed on the MongoDB Berlin 2013 agenda.

Venue:
bcc Berliner Congress Center
Alexanderstr. 11
10178 Berlin

Twitter Hashtag: #MongoDBDays

Join our MongoDB User Group network and learn more about MongoDB and about us! It's worth it.

Feel free to visit comSysto at Devoxx in Antwerp next week!

Next week the annual European Java conference Devoxx will be held in Antwerp, Belgium. Together with its partner 10gen, the company behind MongoDB, and Trifork, a Dutch full-service supplier of high-quality custom-built applications for organizations, comSysto will be represented with a booth, several talks and much more. From 12 to 16 November, Devoxx offers everything a Java developer’s heart desires. The conference includes hands-on workshops, labs and BOFs, and presentations, for example by our partner 10gen on topics related to MongoDB, Agile, the benefits of open source and Women in IT. For further information have a look at their blog!

comSysto will be at stand #11 together with our partner 10gen and our friends at Trifork. Here you can learn all about the latest innovations in MongoDB. Visit us for advice on your future installations, or enter our prize draw to win

  1. a free seat to a MongoDB training or
  2. a free seat to the NoSQL Road Show in Amsterdam or
  3. a brand new Nexus 7!

Find out more about the Devoxx schedule here!

10gen is a software company that develops and provides commercial support for the open source database MongoDB, a NoSQL database that stores data in JSON-like documents with flexible schemas.

Trifork Amsterdam is a leading full service supplier of high-quality custom-built applications for organizations primarily in the following sectors: Education & Research, Government & Non-Profit and Profit.

comSysto is a Munich-based software company specialized in lean business and technology development. While supporting all three steps of the well-known Build-Measure-Learn lean feedback loop, comSysto focuses on open source frameworks and software as major enablers of short, agile Build-Measure-Learn iterations and fast gains in validated learning. Powerful MongoDB technology provides the flexibility and agility needed for turning ideas into products, as well as the performance for handling Big Data while turning data into knowledge.

We also enjoy developing with the Spring framework and its subprojects, Apache Wicket, Gradle, Git, Oracle DB and Oracle BI. comSysto has been dedicated to eliminating waste in both business and technology since 2005.

Conference Venue:
Metropolis Antwerp
Business & Communication Centre
Groenendaallaan 394
2030 Antwerp, Belgium

comSysto Back at MongoDB Munich

Munich, 10 October 2012 – On Tuesday, 16 October, the time has come again. After 2011, the MongoDB Munich conference organized by 10gen takes place this year for the second time at the Hilton Munich Park Hotel. And this year, too, comSysto, 10gen's first German partner, is taking part as a sponsor and with interesting talks.

Day 1, 15 October 2012:
Things get started the day before the conference. On Monday, 15 October, MongoDB workshops will be offered that run through all modules of the public MongoDB trainings in fast-forward. It is a recommended start into the brave new MongoDB world, because the eagerly awaited MongoDB release 2.2 has recently become available with many extensions and improvements: concurrency, aggregation, TTL collections and tag-aware sharding, new analytics capabilities with the Aggregation Framework and Hadoop integration, and new strategies for deployment across multiple data centers.

After the workshops, the next Munich MongoDB User Group meetup takes place. From 18:30* in the “Van Gogh” salon, Bernd Zuther, Software Engineer at comSysto, will speak about “Big Data Solyanka”, a talk that will show “how we can work with Apache Hadoop, MongoDB, the MongoDB-Hadoop Connector and Spring Data to make valuable information in huge amounts of data accessible.” Food and drink will be provided as well.

Day 2, 16 October 2012:
At the MongoDB Munich conference on Tuesday, there will be very interesting user reports from several large organizations in and around Munich, for example the Bavarian State Office for Statistics and Data Processing (Bayerisches Landesamt für Statistik und Datenverarbeitung). comSysto is also represented with two highly interesting and informative talks.

Talk 1, by Johannes Brandstetter, Lead DevOps Engineer at comSysto:
“comSysto builds cloud-based, data-driven applications, so-called ‘Big Data’ applications, for large enterprise customers. This talk gives an overview of use cases that are already completed or currently in progress, as well as of the technology stack with MongoDB and AWS as its central components. The agility and scalability of MongoDB and other open-source products such as Hadoop and Spring within AWS enable the ‘Lean & Agile’ approach we have chosen for delivering exceptional solutions.”

Talk 2, by Tomislav Zorc, Managing Director, and Bernd Zuther, Software Engineer at comSysto: “Mario, a Munich pizza baker, has developed a revolutionary method of pizza production that turns out several thousand pizzas per second. We helped Mario develop his online shop and the accompanying Big Data system with MongoDB and Hadoop. In this talk we walk you through a live demonstration of the three evolutionary stages of Mario's software architecture.”

At www.10gen.com/events/mongodb-munich you will find all talks of this year's MongoDB Munich on agile database development, Big Data, NoSQL, Hadoop and, of course, MongoDB. After the conference, everyone is invited to the reception in the hotel foyer, where comSysto will also have a booth, to wrap up all discussions over a drink in a relaxed atmosphere.

MongoDB Munich agenda:
www.10gen.com/events/mongodb-munich

Venue for the workshops and conference on 15/16 October:
Hilton Munich Park Hotel, Am Tucherpark 7

Venue for the MUG meetup on 15 October, from 18:30:
Hilton Munich Park Hotel, Am Tucherpark 7, Salon Van Gogh
All members of the Munich MongoDB User Group are offered a 10% discount on the conference. More information is available here: www.meetup.com/Muenchen-MongoDB-User-Group/events/85093942/

*An earlier version stated an incorrect time. The meetup definitely starts at 18:30.

Big Data and Data Science – what’s really new?

Big Data is a hype. It’s also a buzzword. Maybe a trend? Down-to-earth people could say it’s just mass data called “big”. Although there are many very large data warehouses in the BI world, data science seems obsessed with handling “big data – when the size of the data itself becomes part of the problem.” For Gartner and Forrester even “big” is not enough anymore; they started using the term “extreme”, and they are right: volume alone is not Big Data.

According to Gartner, Big Data is data at extreme scale in terms of Volume, Velocity, Variety and Variability. Since the word “big” overemphasizes Volume, “extreme” might be the more appropriate term. Anyway, “big” is there, is shorter and sounds better, so let’s stick to it. ;-) Big Data also goes better with big money; extreme money does sound strange, right? According to a new study from Wikibon, Big Data revenues are pegged at $5B in 2012, surging to more than $50B by 2017.

So what’s really new about Big Data? In order to find an answer we first have to ask ourselves: How come? What led to this trend? Let’s have a look at some other important and interdependent trends:

“Software is eating the world” and the Internet Revolution
Two decades ago you needed special training in order to use software systems. Consumers used their Office suites, and the few websites out there were only a bunch of static HTML files. Enterprises had their software to support some specific business functions, mostly with relational storage, and they had just started to put this relational data to use.

The rise of the modern Internet started a new trend in which all of the technology required to transform industries through software finally works and can be widely delivered at global scale. Today consumers and businesses have moved online, more than 2 billion people use broadband internet, and today’s internet is:
- easy to use and everywhere (pervasiveness)
- dynamic, complex and agile (variability)
- extremely large (volume)
- extremely quick (velocity)
- noisy (extracting the message is getting harder)
- vague and uncertain
- not well-structured and diverse (variety)
- not always consistent
- non-relational
- visual
while every single one of these attributes is getting more extreme.

The transformation of Web 1.0 static websites to Web 2.0 web applications is now continuing towards Web 3.0 or Semantic Web where data, their semantics and insights as well as actions derived from that data become the most important part of the internet service.

A Shift in Data
Is Big Data only about Web or Internet data? Not necessarily, but the WWW is still the main driver. Add to that a new awareness of an old fact: unlike people, not all data is equal, and the inequality is even growing. Many new consumer and enterprise apps create data footprints that are constantly growing larger and faster, in more diverse formats, and becoming more complex. So why treat all data equally? Why would you want to store and process data streams of RFID messages the same way as your business transaction data? Well, only if you have no choice.

Many people talk about unstructured data being Big Data. Thinking about the term “unstructured data” for longer than a few seconds raises the following questions: What is data without structure? When does structure end? How can it be interpreted and analyzed?

The answers are: There is no data without structure. If there is absolutely no structure or context, it’s just noise and you can forget about analyzing it. Even a piece of text has a certain structure and context, therefore one can mine it in order to extract the semantics. What most people mean by “unstructured” is data coming from a “non-relational” source with varying structure. After 40 years of dealing with nice and tidy relational data in analytical environments the brave new world surely might seem a bit chaotic and unstructured. But it’s not, it’s just different.
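A small sketch may make “varying structure” more concrete. The events below are invented for illustration, and `field_paths` is a hypothetical helper that lists the fields each record carries. Every record has structure; it just is not the same structure from one record to the next.

```python
import json

# Hypothetical events from a non-relational source: each record has its own shape.
raw_events = [
    '{"type": "order", "customer": "anna", "items": ["margherita", "cola"]}',
    '{"type": "rfid", "tag": "A1B2", "reader": {"id": 7, "gate": "north"}}',
    '{"type": "order", "customer": "ben", "items": ["salami"], "coupon": "PIZZA10"}',
]

def field_paths(value, prefix=""):
    """Return the set of dotted field paths found in a parsed JSON value."""
    if not isinstance(value, dict):
        return {prefix} if prefix else set()
    paths = set()
    for key, child in value.items():
        paths |= field_paths(child, f"{prefix}.{key}" if prefix else key)
    return paths

events = [json.loads(line) for line in raw_events]
for event in events:
    print(event["type"], sorted(field_paths(event)))
```

Forcing these three records into one fixed table would mean either many empty columns or a schema change for every new field, which is the practical reason such data tends to be called “unstructured”.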

NoSQL – new choice for Data Storage and Processing
In order to efficiently process this kind of data for generating insights and actions, a new set of data management and processing software has emerged. These software technologies are:
- mostly Open-Source and frequently JVM based
- excellent in scaling through massive parallelism on commodity computing capacity
- non-relational
- schemaless
- storing and processing all different kinds of data formats such as JSON, XML, Binary, Text, …
They represent the alternative that has been missing so far for many use cases such as (complex) event processing, operational intelligence, machine learning, real-time analytics, genetic algorithms, sentiment analysis, etc.

Traditional mass data storage and integration solutions in the domain of Data Warehousing and Business Intelligence are based on relational formats and batch processing, and have been running for years on large, expensive and poorly scalable enterprise editions of RDBMS and even more expensive enterprise hardware. As history has shown many times, it is not always the idea or the use case searching for the right technology (as one would expect), but also the new technology inspiring people to generate ideas and drive innovation.

Looking at the components of a data-driven or analytical application, the following technologies associated with the term “Big Data” have already taken a leading role:
- MongoDB for data storage, real-time processing and operational intelligence: a JSON-based, schemaless, document-oriented DBMS
- Apache Hadoop for ETL/batch processing, implementing the MapReduce algorithm for aggregation
- R Project for statistical computing and data visualization
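For readers who have not seen MapReduce before, the following single-process Python sketch shows the three phases behind such an aggregation. It does not use Hadoop; the input data and the function names `map_phase`, `shuffle` and `reduce_phase` are assumptions chosen for illustration. In a real cluster the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict

# Hypothetical input records: (pizza, quantity) pairs from order lines.
order_lines = [("margherita", 2), ("salami", 1), ("margherita", 1), ("funghi", 3)]

def map_phase(record):
    """Emit (key, value) pairs for one input record."""
    pizza, quantity = record
    yield pizza, quantity

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Aggregate all values for one key into a single result."""
    return key, sum(values)

mapped = (pair for record in order_lines for pair in map_phase(record))
totals = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(totals)
```

Because each map call sees only one record and each reduce call sees only one key, the work can be split across commodity machines without the pieces needing to talk to each other.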

Hardware and High Performance Cloud Computing
All of the above technologies allow High Performance Computing by scaling out across clusters of commodity hardware. As computing capacity keeps getting cheaper and seemingly limitless through different “Cloud” offerings, we no longer have to ask ourselves “Do we really need this data?” before storing it. Store first, analyze later is reality today, not only because disk storage is cheap, but also because we can add computing capacity for a limited time once we want to run our analyses.

It is the combination of the above-mentioned trends that adds up to a different way of looking at data today. These trends surely depend on and affect each other, but explaining this would lead us off the subject. Being a practical person, I would like to go into more detail and describe an analytical platform based on the three leading technologies: MongoDB, Apache Hadoop and R. Not now and not here, so stay tuned…

Links

http://en.wikipedia.org/wiki/Big_data

http://wikibon.org/wiki/v/Big_Data_Market_Size_and_Vendor_Revenues

http://online.wsj.com/article/SB10001424053111903480904576512250915629460.html

http://www.forbes.com/sites/ciocentral/2012/06/06/seven-best-practices-for-revolutionizing-your-data/

http://www.forbes.com/sites/danwoods/2012/03/08/hilary-mason-what-is-a-data-scientist/

http://www.itnext.in/content/volume-alone-not-big-data-gartner.html

http://www.marketwatch.com/story/big-data-is-big-business-50b-market-by-2012-2012-02-22

http://data-virtualization.com/2011/05/23/gartner-and-forrester-%E2%80%9Cnearly%E2%80%9D-agree-on-extreme-big-data/

http://practicalanalytics.wordpress.com/2011/11/11/big-data-infographic-and-gartner-2012-top-10-strategic-tech-trends/

http://www.jaspersoft.com/bigdata#bigdata-middle-tab-5

http://www.datasciencecentral.com/profiles/blogs/5-big-data-startups-that-matter-platfora-datastax-visual-ly-domo-

http://www.thisisthegreenroom.com/2011/data-science-vs-business-intelligence/

http://tdwi.org/articles/2012/02/07/big-data-killed-data-modeling-star.aspx?utm_source=twitterfeed&utm_medium=twitter

http://blogs.wsj.com/tech-europe/2012/02/10/big-data-demands-new-skills/?mod=google_news_blog

http://radar.oreilly.com/2010/06/what-is-data-science.html

http://www.citoresearch.com/content/growing-your-own-data-scientists

Munich MongoDB User Group: First Meetup

You are invited to the first meetup of the Munich MongoDB User Group!

Date: 6/28/2011
Time: Starting 7pm
Who: Brendan McAdams, 10gen Corp.
Subject: “A MongoDB Tour for the Experienced and Newbie Alike”
Location: Münchner Technologiezentrum, comSysto GmbH, Agnes-Pockels-Bogen 1, D – 80992 Munich
http://www.comsysto.com/
http://twitter.com/#!/comsysto

A Few Facts on MongoDB:
„MongoDB is an open source, document-oriented database designed with both scalability and developer agility in mind. Instead of storing your data in tables and rows as you would with a relational database, in MongoDB you store JSON-like documents with dynamic schemas. The goal of MongoDB is to bridge the gap between key-value stores (which are fast and scalable) and relational databases (which have rich functionality).
Using BSON (binary JSON), developers can easily map to modern object-oriented languages without a complicated ORM layer. This new data model simplifies coding significantly, and also improves performance by grouping relevant data together internally.
MongoDB was created by former DoubleClick Founder and CTO Dwight Merriman and former DoubleClick engineer and ShopWiki Founder and CTO Eliot Horowitz. They drew upon their experiences building large scale, high availability, robust systems to create a new kind of database. MongoDB maintains many of the great features of a relational database — like indexes and dynamic queries. But by changing the data model from relational to document-oriented, you gain many advantages, including greater agility through flexible schemas and easier horizontal scalability.“
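The difference between the two data models can be sketched without a database at all. In the following Python example, the table contents and the helper `to_document` are invented for illustration; it only shows how rows from two related tables collapse into one JSON-like document of the kind a document store would hold.

```python
import json

# Relational shape: two "tables" whose rows are linked by order_id.
orders = [{"order_id": 1, "customer": "anna"}]
order_items = [
    {"order_id": 1, "pizza": "margherita", "quantity": 2},
    {"order_id": 1, "pizza": "funghi", "quantity": 1},
]

def to_document(order, items):
    """Embed the matching line items in the order, giving one self-contained document."""
    return {
        "customer": order["customer"],
        "items": [
            {"pizza": row["pizza"], "quantity": row["quantity"]}
            for row in items
            if row["order_id"] == order["order_id"]
        ],
    }

document = to_document(orders[0], order_items)
print(json.dumps(document, indent=2))
```

Reading the whole order now takes one lookup instead of a join, which is what the quoted text means by grouping relevant data together.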

Do you want to learn more about MongoDB? Then please register via
http://www.meetup.com/Munchen-MongoDB-User-Group/
or
https://www.xing.com/events/munich-mongodb-user-group-meetup-781984
and pay us a visit! The number of participants is unfortunately limited to 50.

For any further information please contact Matija Gasparevic (office@comsysto.com).