Friday Mar 27, 2015

Open World 2015 call for papers - my simple guidelines

OOW Banner 2013

Most of you will already have received an email from the OpenWorld team announcing the call for papers for this year’s conference: https://www.oracle.com/openworld/call-for-proposals.html. Each year, a lot of people ask me how they can increase their chances of getting their paper accepted? Well, I am going to start by stating that product managers have absolutely no influence over which papers are accepted - even mentioning that a product manager will be co-presenting with you will not increase your chances! Yes, sad but true!

So how do you make sure that your presentation title and abstract catches the eye of the selection committee? Well, here is my list of top 10 guidelines for submitting a winning proposal...

[Read More]

Friday Sep 26, 2014

Why SQL is becoming the goto language for Big Data analysis

Since the term big data first appeared in our lexicon of IT and business technology it has been intrinsically linked to the no-SQL, or anything-but-SQL, movement. However, we are now seeing that SQL is experiencing a renaissance. The term “noSQL” has softened to a much more realistic approach "not-only-SQL" approach. And now there is an explosion of SQL-based implementations designed to support big data. Leveraging the Hadoop ecosystem, there is: Hive, Stinger, Impala, Shark, Presto and many more. Other NoSQL vendors such as Cassandra are also adopting flavors of SQL. Why is there a growing level of interest in the reemergence of SQL? Probably, a more pertinent question is: did SQL ever really go away? Proponents of SQL often cite the following explanations for the re-emergence of SQL for analysis:

  1. There are legions of developers who know SQL. Leveraging the SQL language allows those developers to be immediately productive.
  2. There are legions of tools and applications using SQL today.
  3. Any platform that provides SQL will be able to leverage the existing SQL ecosystem.

However, despite the virtues of these explanations, they alone do not explain the recent proliferation of SQL implementations. Consider this: how often does the open-source community embrace a technology just because it is the corporate orthodoxy? The answer is: probably not ever. If the open-source community believed that there was a better language for basic data analysis, they would be implementing it. Instead, a huge range of emerging projects, as mentioned earlier, have SQL at their heart The simple conclusion is that SQL has emerged as the de facto language for big data because, frankly, it is technically superior. Let’s examine the four key reasons for this:

  1. SQL is a natural language for data analysis.
  2. SQL is a productive language for writing queries.
  3. SQL queries can be optimised.
  4. SQL is extensible.

1. SQL is a natural language for data analysis.

The concept of SQL is underpinned by the relational algebra - a consistent framework for organizing and manipulating sets of data - and the SQL syntax concisely and intuitively expresses this mathematical system.

Most business users, data analysts and even data scientists think about data within the context of a spreadsheet. If you think about a spreadsheet containing a set of customer orders then what do most people do with that spreadsheet? Typically, they might filter the records to look only at the customer orders for a given region. Alternatively, they might hide some columns: maybe the customer address is not needed for a particular piece of analysis, but the customer name and their orders are important data points. Finally, they might add calculations to compute totals and/or perhaps create a cross tabular report.

Within the language of SQL these are common steps: 1) projections (SELECT), 2) filters and joins (WHERE), and 3) aggregations (GROUP BY). These are core operators in SQL. The vast majority of people have found the fundamental SQL query constructs to be straightforward and readable representation of everyday data analysis operations.

2. SQL is a productive language for writing queries.

When a developer writes a SQL query, he or she simply describes the results that they want. The developer does not have to get into any of the nitty-gritty of describing how to get the results 

This type of approach is often referred to as  'declarative programming,’ and it makes the developer's job easier. Even the simplest SQL query illustrates the benefits of declarative programming:

SELECT day, prcp, temp FROM weather
WHERE city = 'San Francisco' AND prcp > 0.0;

SQL engines may have multiple ways to execute this query (for example, by using an index). Fortunately the developer doesn't need to understand any of the underlying database processing techniques. The developer simply specifies the desired set of data using projections (SELECT) and filters (WHERE).

This is perhaps why SQL has emerged as such an attractive alternative to the MapReduce framework for analyzing HDFS data. MapReduce requires the developer to specify, at each step, how the underlying data is to be processed. For the same “query", the code is longer and more complex in MapReduce. For the vast majority of data analysis requirements, SQL is more than sufficient, and the additional expressiveness of MapReduce introduces complexity without providing significant benefits.


3. SQL queries can be optimized

The fact that SQL is a declarative language not only shields the developer from the complexities of the underlying query techniques, but also gives the underlying SQL engine has a lot of flexibility in how to optimize any given query. 

In a lot of programming languages, if the code runs slow, then it's the programmer's fault. For the SQL language, however, if a SQL query runs slow, then it's the SQL engine's fault.

This is where analytic databases really earn their keep – databases can easily innovate ‘under the covers’ to deliver faster performance; parallelization techniques, query transformations, indexing and join algorithms are just a few key areas of database innovation that drive query performance.

4. SQL is extensible

SQL provides a robust framework that adapts to new requirements

SQL has stayed relevant over the decades because, even though its core is grounded in universal data processing techniques, the language itself can be extended with new processing techniques and new calculations. Simple time-series calculations, statistical functions, and pattern-matching capabilities have all been added to SQL over the years. 

Consider, as a recent example, what many organizations realized as they started to ask queries such as 'how many distinct visitors came to my website last month?' These organizations realized that it is not vital to have a precise answer to this type of query ... an approximate answer (say, within 1%) would be more than sufficient. This has requirement has now been quickly delivered by implementing the existing hyperloglog algorithms within SQL engines for 'approximate count distinct' operations. 

More importantly, SQL is a language that is not explicitly tied to a storage model. While some might think of SQL as synonymous with relational databases, many of the new adopters of SQL are built on non-relational data. SQL is well on its way to being a standard language for accessing data stored in JSON and other serialized data structures.  

Summary

SQL is an immensely popular language today … and if anything its popularity is growing as the language is adopted for new data types and new use cases. The primacy of SQL for big data is not simply a default choice, but a conscious realization that SQL is the best suited language for basic analysis

PS. Next week, many sessions at this year’s OpenWorld will focus on the power, richness and performance of SQL for sophisticated data analysis including the following:

Monday September 28

Using Analytical SQL to Intelligently Explore Big Data @ 4:00PM Moscone North 131

Joerg Otto - Head of Database Engineering, IDS GmbH
Marty Gubar - Director, Oracle
Keith Laker - Senior Principal Product Manager, Data Warehousing and Big Data, Oracle


YesSQL! A Celebration of SQL and PL/SQL @ 6:00PM Moscone South 103

Steven Feuerstein - Architect, Oracle
Thomas Kyte - Architect, Oracle


Tuesday September 29

SQL Is the Best Development Language for Big Data @ 10:45AM Moscone South 104

Thomas Kyte - Architect, Oracle

Enjoy OpenWorld 2014 and if you have time please come and meet the Analytical SQL team in the Moscone South Exhbition Hall. We will be on the Parallel Execution and Advanced SQL Processing demo booth (id 3720).

Friday Sep 12, 2014

Your indispensable guide to DW and Big Data at OpenWorld in iBook and PDF formats

There's so much to see and learn at Oracle OpenWorld because it provides more educational and networking opportunities than any other conference dedicated to Oracle business and technology users. 

What to expect at OOW 2014 - We will be announcing a wide range of continuous data warehouse innovations in both hardware and software. Join Oracle experts as we dive deep into the latest generation of data warehouse innovations for analyzing enterprise data and diverse big data streams to derive real business value. You will also learn data warehouse best practices and hear from customers consolidating business analysis onto a common scalable platform. Hands-on labs are available for both beginners and experts giving you the chance to try some of these innovative data warehouse technologies first-hand.

To help you get the most from this year’s event I have put together a comprehensive downloadable guide of all the data warehousing and big data activities at @OracleOpenWorld 2014. If you are smartphone and/or tablet user then checkout our amazing web apps (see previous post OpenWorld on your iPad and iPhone - Now Fully Operational!). If you don't have a tablet or a suitable smartphone of just want a downloadable booklet then this guide contains everything to help you get the most from this year’s conference, including the following:

  • Overview of OpenWorld - why you have got to be there!
  • Video Guide to Data Warehousing with Oracle Database 12c
  • Comprehensive day-by-day session calendar
  • List of must-see sessions
  • List of hands-on labs
  • Map of all the most important session and lab venues
  • List of demo pods and guide to demo grounds 
  • Comprehensive presenter biographies
  • Profiles for key Data Warehouse customers
  • Live twitter feeds from your data warehouse product managers
  • Links for more information

iBook Cover PDF Cover
Click here to download Guide in Apple iBook format

Please note that this Apple iBook can be used on any Apple Mac computer or iPad running the iBook application. iPod touch and iPhone users should use the PDF version of this guide.
Click here to download Guide in PDF format

Enjoy @OracleOpenWorld 2014  and if you have time please stop by the Parallel Execution and Analytical SQL demo booth in the demo grounds and say hello.

Thursday Sep 11, 2014

OpenWorld on your iPad and iPhone - Now Fully Operational!

In one of my recent blog post I provided links to our OpenWorld data warehouse web app for smartphones and tablets. Now that the OOW team has released the hands-on lab schedule (it is now live on the OpenWorld site) I have updated my smartphone and tablet apps to include the list of hands-on labs on a day-by-day basis (Monday Tuesday, Wednesday, Thursday). The list of hands-on labs can still be viewed in subject area order (data warehousing and big data) within the app via the “Switch to subject view” link in the top left part of the screen. 

iPad-Labs.jpg iPhone-Labs.jpg

I have also added a location map which can be viewed by clicking on the linked-text, “View location map", which is in the top right part of the screen on each application. The location map that is available within both the tablet and smartphone apps is shown below:

Oow locations

If you want to run these updated web apps on your smartphone and/or tablet then you can reuse the existing links that I published on my last blog post. If you missed that post then  follow these links:

Android users: I have tested the app on Android and there appears to be a bug in the way the Chrome browser displays frames since scrolling within frames does not work . The app does work correctly if you use either the Android version of the Opera browser or the standard Samsung browser on Samsung devices. 

Please note that I have also published online calendars (via my Google account) which can viewed via the following blog posts:


If you have any comments about the app (content you would like to see) then please let me know. Enjoy OpenWorld and, if you have time, it would be great to see you if you want to stop by at the Parallel Execution and Analytical SQL demo booth.

Wednesday Sep 10, 2014

Online Calendar for Data Warehousing Hands-on Labs at OpenWorld now available

There so many exciting hands-on labs at this years OpenWorld conference and the schedule builder is now live so you can start booking your seat at these labs. To help you get organized and pick the most useful labs to attend I have published a new shared online calendar that contains all the most important data warehouse and big data hands-on labs at this year’s OpenWorld. The following links will allow you to add this shared calendar to your own calendar application:

HOL-Calendar.jpg

Hope this helps you get organised for this year’s incredible conference. Any comments then let me know. The online calendar for all the most important data warehousing and big sessions is available via my previous blog post “Online Calendar for Data Warehousing Sessions at OpenWorld now available”. 

Enjoy.

Thursday Sep 04, 2014

Online Calendar for Data Warehousing Sessions at OpenWorld now available

I have published a shared online calendar that contains all our data warehouse and big data sessions at this year’s OpenWorld. The following links will allow you to add this shared calendar to your own calendar application:

oow-calendar

As soon as the dates and times for the keynote sessions are published I will add these to the calendar so keep checking for updates.

Hope this helps you get organised for this year’s incredible conference. Any comments then let me know and if I missed your data warehouse/big data session then let me know and I will add it to the calendar.

Enjoy.

Monday Aug 25, 2014

OpenWorld on your iPad and iPhone

Most of you probably know that each year I publish a data warehouse guide for OpenWorld which contains links to the latest data warehouse videos, a calendar for the most important sessions and labs and a section that provides profiles and relevant links for all the most important data warehouse presenters. For this year’s conference I have made all this information available via an HTML app that runs on most smartphones and tablets. The pictures below show the HTML app running on iPad and iPhone.

IPad OOW IPhone OOW

This exciting new web app contains information about why you should attend OpenWorld - just in case you have not yet booked your ticket! - as well as the following information:

  • Getting to know 12c - a series of video interviews with George Lumpkin, Vice President of Data Warehouse Product Management
  • Your presenters - full biographies and links to social media sites for all the key data warehouse presenters
  • Must sees sessions - list of all the most important data warehouse presentations at this year’s conference
  • Our customers - profiles our most important data warehouse customers
  • Must attend labs - list of all the most important data warehouse hands-on labs at this year’s conference
  • Links - a list of links to the most important data warehouse sites

If you want to run these web apps on your smartphone and/or tablet then follow these links:

Android users: I have tested the app on Android and there appears to be a bug in the way the Chrome browser displays iframes because scrolling does not work . The app does work correctly if you use either the Android version of the Opera browser or the standard Samsung browser.

If you have any comments about the app (content you would like to see) then please let me know. Enjoy OpenWorld.

Monday Jun 30, 2014

My Data Warehousing and Big Data Must-See Guide for OpenWorld

Data Warehousing Must-See Session Guide for Open World 2014
IBook2014

There’s so much to learn at this year’s Oracle OpenWorld - it provides more educational and networking opportunities than any other conference dedicated to Oracle business and technology users. To get the most from this year’s event I have prepared an initial guide which lists all the must important data warehousing and big data sessions. The first part of the guide provides links to videos and content from last year’s event so you can either re-live the highlights from last year or see what you missed by not being there! If you have an iPad or use OSX 10.9 then you will want to download the iBook version because this contains video highlights from last year’s database keynote session.

The session guide is divided into the following chapters:

  • Data Warehousing
  • Performance and Scalability
  • Database Analytics
  • Industry Data Models and Data Integration
  • Unstructured Data
  • Big Data

The last part of the booklet lists all the key data warehouse and big data product managers who will be presenting at this year’s conference. Included alongside each speaker’s biography are links to their social media sites and blogs.

Please note that, as usual, there will be hands-on labs at this year’s OpenWorld but these have not been included at in the session catalog. I am expecting them to be added shortly. Details of all our labs will appear in the more detailed guide that I will publish before the conference begins.

The Must-See guide is available in two formats: 

For iPad and OSX 10.9 (Mavericks) users please download the iBook, which is available here: https://dl.dropboxusercontent.com/u/69436560/OOW/Must-See%20OpenWorld%202014_Sessions.ibooks

For all other platforms please download the PDF, which is available here: https://dl.dropboxusercontent.com/u/69436560/OOW/Must-See%20OpenWorld%202014_Sessions.pdf

 Hope to see you in September at OpenWorld. 

Tuesday Apr 15, 2014

OpenWorld call for Papers closes today!

 Just a gentle reminder - if you have not submitted a paper for this year's OpenWorld conference then there is still just enough time because the deadline is Today (Tuesday, April 15) at 11:59pm PDT. The call for papers website is here http://www.oracle.com/openworld/call-for-papers/index.html and this provides all the details of how and what to submit.

I have been working with a number of customers on some really exciting papers so I know this year's conference is going to be really interesting for data warehousing and analytics. I would encourage everyone to submit a paper, especially if you have never done this before. Right now both data warehousing and analytics are among the hottest topics in IT and I am sure all of you have some great stories that you could share with your industry peers who will be attending the conference. It is a great opportunity to present to your peers and also learn from them by attending their data warehouse/analytics sessions during this week long conference. And of course you get a week of glorious Californian sunshine and the chance to spend time in one of the World's most beautiful waterfront cities.

If you would like any help submitting a proposal then feel free to email during today and I will do my best to provide answers and/or guidance. My email address is keith.laker@oracle.com.

Have a great day and get those papers entered into our OpenWorld system right now! 

Friday Mar 21, 2014

Open World 2014 - guidelines for call-for-papers…

OOW Banner 2013

Most of you will already have received an email from the OOW team announcing the call for papers for this year's conference: http://www.oracle.com/openworld/call-for-papers/index.html. Each year, customers ask me how they can increase their chances of getting their paper accepted? Well, I am going to start by stating that product managers have absolutely no influence over which papers are accepted - even mentioning that a product manager will be co-presenting with you will not increase your chances!

So how do you increase you make sure that your presentation title and abstract catches the eye of the selection committee? Here is my top 10 list of guidelines for submitting proposals:

1) Read the "call-for-papers" carefully and follow its instructions - even if you have submitted presentations for lots of Oracle conferences it is always a good idea to carefully read the call for papers and to make sure you follow the instructions. There is an excellent section towards the end of the call-for-papers web page, "Tips and Guidelines"

2) Address the theme of the conference - If this is available when the call the for papers is announced then try to address the theme of the conference within your abstract.

3) Address the key data warehouse focus areas - for this year's OOW 2014 the key focus areas for data warehousing will be partitioning, analytical SQL, parallel execution, workload management and logical data warehouse. If possible try to include one or more of these focus areas within your abstract.

4) Have a strong biography - You need to use your biography to differentiate and build credibility. This is an important topic because it allows you to differentiate yourself from all the other presenters who are trying to get speaking slots. Your biography must explain why you are an authority on the topic you have chosen for your presentation and why people will want to listen to what you have to say.

5) Have a strong business case - build your presentation around a strong business case, relevant to your industry and/or your target audience (DBAs, developers, architects etc). Try to explain in clear and simple terms the problem you needed to solve, how you solved it using Oracle technology and the direct technical/business benefits.

6) Make the title and abstract interesting - Your title and abstract must be easy to read and make sure you introduce your main idea as early as possible. Review the titles and abstracts from previous conferences as a guide. Ideally make the issue relevant to the delegates attending OWW, get to the point, and make sure it is easy to read.

7) Look at previous presentations - the content catalog for last year's conference is available online,see here:https://oracleus.activeevents.com/2013/connect/search.ww?eventRef=openworld. You can review all the titles and abstracts that were accepted and use them as guidelines for creating your own title and abstract for this year's conference.

8) Write clear outcomes - The majority of the best presentations have clearly stated outcomes. What do you expect that conference attendees will be able do or know at the end of your session? Consider including a sentence at the end of your abstract such as the following: “At the end of this presentation, participants will be able to . . . .”

9) Don’t submit your paper right away - Once you have a title and abstract show it to a few colleagues. Get some feedback. You probably know many people who’d be happy to give you ideas on making your paper better.

10) Keep number of submissions low - You do not increase your chances of getting a paper accepted by submitting lots of different papers.

I cannot guarantee you success if you follow these guideline but I hope they prove helpful. Good luck with your submission(s) and I look forward to seeing at you at this year's OpenWorld in San Francisco.

OOW2014

Wednesday Nov 13, 2013

ADNOC talks about 50x increase in performance

In this new video Awad Ahmen Ali El Sidddig, Senior DBA at ADNOC, talks about the impact that Exadata has had on his team and the whole business. ADNOC is using our engineered systems to drive and manage all their workloads: from transaction systems to payments system to data warehouse to BI environment. A true Disk-to-Dashboard revolution using Engineered Systems. This engineered approach is delivering 50x improvement in performance with one queries running 100x faster! The IT has even revolutionised some of their data warehouse related processes with the help of Exadata and now jobs that were taking over 4 hours now run in a few minutes.

[Read More]

Tuesday Oct 22, 2013

OOW content for Pattern Matching....

If you missed my sessions at OpenWorld then don't worry - all the content we used for pattern matching (presentation and hands-on lab) is now available for download.

My presentation "SQL: The Best Development Language for Big Data?" is available for download from the OOW Content Catalog, see here: https://oracleus.activeevents.com/2013/connect/sessionDetail.ww?SESSION_ID=9101

For the hands-on lab ("Pattern Matching at the Speed of Thought with Oracle Database 12c") we used the Oracle-By-Example content. The OOW hands-on lab uses Oracle Database 12c Release 1 (12.1) and uses the MATCH_RECOGNIZE clause to perform some basic pattern matching examples in SQL. This lab is broken down into four main steps:
  • Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses.
  • Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. These patterns use regular expressions syntax, a powerful and expressive feature, applied to the pattern variables you define.
  • Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause.
  • Define measures, which are expressions usable in the MEASURES clause of the SQL query.
You can download the setup files to build the ticker schema and the student notes from the Oracle Learning Library. The direct link to the example on using pattern matching is here: http://apex.oracle.com/pls/apex/f?p=44785:24:0::NO:24:P24_CONTENT_ID,P24_PREV_PAGE:6781,2.

Wednesday Oct 09, 2013

Big Data Openworld Sessions now available for Download/Viewing

For those who did go to Openworld, the session catalog now has the download links to the session materials online. You can now refresh your memory and share your experience with the rest of your organization. For those who did not go, here is your chance to look over some of the materials.

On the big data side, here are some of the highlights:

There are a great number of other sessions, simply look for: Solutions => Big Data and Business Analytics and you will find a wealth of interesting content around big data, Hadoop and analytics.

Wednesday Mar 13, 2013

Call for Papers: Oracle Openworld

Yes, it is almost time to book those trips to San Francisco, and here is the chance to speak at Openworld. The call for papers is now open:

Go here to submit your paper.

Oh, and don't wait too long, this opportunity of a life time closes Friday April 12, 11:59 PDT!!


Wednesday Sep 12, 2012

Big Data Sessions at Openworld 2012

If you are coming to San Francisco, and you are interested in all the aspects to big data, this Focus On Big Data is a must have document. 

Some (other) highlights:

  • A performance demo of a full rack Big Data Appliance in the engineered systems showcase
  • A set of handson labs on how to go from a NoSQL DB to an effective analytics play on big data
  • Much, much more

See you all in a few weeks in SF!

About

The data warehouse insider is written by the Oracle product management team and sheds lights on all thing data warehousing and big data.

Search

Archives
« July 2015
SunMonTueWedThuFriSat
   
1
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 
       
Today