Join us for the inaugural Apache Sentry meetup at Oracle's offices in NYC, on the evening of the last day of Strata + Hadoop World 2013 in New York. (Oracle Offices, 120 Park Ave, 26th Floor. Note: bring your ID and check in with security in the lobby!)

We'll kick off the meetup with the following presentation:

Getting Serious about Security with Sentry
Presenters: Shreepadma Venugopalan, Lead Engineer for Sentry; Arvind Prabhakar...
Introduction

Documentation and most discussions are quick to point out that HDFS provides OS-level permissions on files and directories. However, there is less readily available information about how those OS-level permissions affect access to data in HDFS through higher-level abstractions such as Hive or Pig. To provide a bit of clarity, I decided to run through the effects of permissions on different interactions with HDFS.

The Setup

In this scenario, we have three...
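Before any Hive or Pig query can touch a table's files, HDFS evaluates the familiar POSIX-style owner/group/other permission bits. As a rough illustration only (the function name and the example users are mine, not from the post, and the sketch is in Python rather than actual HDFS code), the check works like this:

```python
# Sketch of the POSIX-style check HDFS applies to a file or directory:
# exactly one class of bits is consulted -- owner if the user owns the
# file, else group if the user belongs to the file's group, else other.

READ, WRITE, EXECUTE = 4, 2, 1

def hdfs_allows(mode, owner, group, user, user_groups, wanted):
    """Return True if `user` may perform `wanted` (READ/WRITE/EXECUTE)
    on a file with octal `mode`, owned by `owner`:`group`."""
    if user == owner:
        bits = (mode >> 6) & 7          # owner class (e.g. the 7 in 0o750)
    elif group in user_groups:
        bits = (mode >> 3) & 7          # group class (the 5 in 0o750)
    else:
        bits = mode & 7                 # other class (the 0 in 0o750)
    return bits & wanted == wanted

# A warehouse directory created as hdfs:hadoop with mode 750:
print(hdfs_allows(0o750, "hdfs", "hadoop", "hdfs", [], READ))           # True: owner
print(hdfs_allows(0o750, "hdfs", "hadoop", "alice", ["hadoop"], READ))  # True: group
print(hdfs_allows(0o750, "hdfs", "hadoop", "bob", [], READ))            # False: other
```

The practical consequence is that a Hive query fails with an access error whenever the querying user falls into a class whose bits deny read (or, for writes, write) access on the underlying HDFS paths.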
There is a lot of hype around big data, but here at Oracle we try to help customers implement big data solutions to solve real business problems. For those of you interested in understanding more about how you can put big data to work at your organization, consider joining these events:

San Jose | August 5-6
Marriott San Jose, 301 S Market St, San Jose, California 95113
Event Registration Page

Chicago | August 7-8
The Westin Michigan Avenue, 909 N Michigan Ave, Chicago, IL...
A quick update on some of the integration components needed to build things like M2M (machine-to-machine communication) and to integrate fast-moving data (events) with Hadoop and the NoSQL Database. As of release 11.1.1.7 of the Oracle Event Processing product you now have:

OEP Data Cartridge for Hadoop (the real doc is here)
OEP Data Cartridge for NoSQL Database (the real doc is here)

The fun with these products is that you can now model (in a UI!) how to interact with these...
For those interested in understanding how to actually build a big data solution including things like NoSQL Database, Hadoop, MapReduce, Hive, Pig, and analytics (data mining, R), have a look at the big data videos Marty did:

Video 1: Using Big Data to Improve the Customer Experience
Video 2: Using Big Data to Deliver a Personalized Service
Video 3: Using Big Data and NoSQL to Manage On-line Profiles
Video 4: Oracle Big Data and Hadoop to Process Log Files
Video 5: Integrate...
This week Oracle announced the availability (yes, you can buy and use these systems right away) of the Big Data Appliance X3-2 Starter Rack and the Big Data Appliance X3-2 In-Rack Expansion. You can read the press release here. For those who are interested in the operating specs, it is best to look at the data sheet on OTN. So what does this mean? In effect, it means that you can now start any big data project with an appliance. Whether you are looking to try your hand on...
Introduction

In the final installment in our series on Hive UDFs, we're going to tackle the least intuitive of the three types: the User Defined Aggregating Function. While they're challenging to implement, UDAFs are necessary if we want functions for which the distinction between map-side and reduce-side operations is opaque to the user. If a user is writing a query, most would prefer to focus on the data they're trying to compute, not on which part of the plan is running a given...
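That map-side/reduce-side split is what the UDAF evaluator lifecycle encodes: map tasks feed rows to iterate() and emit a partial aggregate, and reduce tasks merge() those partials before terminate() produces the final answer. As a conceptual sketch only (in Python rather than the post's Java, with a simple average standing in for a real function), the lifecycle looks like this:

```python
# Conceptual sketch of the evaluator lifecycle a Hive UDAF implements:
# iterate() and terminate_partial() run map-side, merge() and terminate()
# run reduce-side. Shown here for a plain average.

class AvgEvaluator:
    def new_buffer(self):
        return {"sum": 0.0, "count": 0}        # the aggregation buffer

    def iterate(self, buf, value):             # map side: consume one row
        buf["sum"] += value
        buf["count"] += 1

    def terminate_partial(self, buf):          # map side: emit a partial
        return (buf["sum"], buf["count"])

    def merge(self, buf, partial):             # reduce side: combine partials
        buf["sum"] += partial[0]
        buf["count"] += partial[1]

    def terminate(self, buf):                  # reduce side: final value
        return buf["sum"] / buf["count"] if buf["count"] else None

# Two "mappers" each aggregate one split; one "reducer" merges the partials:
ev = AvgEvaluator()
partials = []
for split in ([1.0, 2.0, 3.0], [4.0, 5.0]):
    buf = ev.new_buffer()
    for v in split:
        ev.iterate(buf, v)
    partials.append(ev.terminate_partial(buf))

final = ev.new_buffer()
for p in partials:
    ev.merge(final, p)
print(ev.terminate(final))   # 3.0, the average of 1..5
```

Note why a partial average must carry (sum, count) rather than the average itself: averages of averages are wrong when splits differ in size, which is exactly the kind of subtlety that makes UDAFs the least intuitive of the three types.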
Introduction

In our ongoing exploration of Hive UDFs, we've covered the basic row-wise UDF. Today we'll move to the UDTF, which generates multiple rows for every row processed. This UDF built its house from sticks: it's slightly more complicated than the basic UDF and gives us an opportunity to explore how Hive functions manage type checking. We'll step through some of the more interesting pieces, but as before the full source is available on github here.

Extending...
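The defining difference from a row-wise UDF is the output shape: a UDTF may emit zero, one, or many rows per input row (Hive's built-in explode() is the classic example). As a conceptual Python sketch only (not the post's Java source, and the column names are mine), the behavior is:

```python
# Sketch of what a table-generating function does: where a plain UDF maps
# one row to one value, a UDTF yields any number of output rows per input
# row. Here we expand a comma-delimited column into separate rows.

def explode_csv(row):
    """Yield one output row per comma-separated item in row['tags']."""
    for tag in row["tags"].split(","):
        yield {"id": row["id"], "tag": tag.strip()}

rows = [
    {"id": 1, "tags": "hive, udf"},
    {"id": 2, "tags": "hadoop"},
]

output = [out for row in rows for out in explode_csv(row)]
print(output)
# [{'id': 1, 'tag': 'hive'}, {'id': 1, 'tag': 'udf'}, {'id': 2, 'tag': 'hadoop'}]
```

In the Java version this output schema can't be inferred row by row; the function has to declare its column names and types up front, which is where the type checking discussed in the post comes in.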
Introduction

In our ongoing series of posts explaining the ins and outs of Hive User Defined Functions, we're starting with the simplest case. Of the three little UDFs, today's entry built a straw house: simple, easy to put together, but limited in applicability. We'll walk through the important parts of the code, but you can grab the whole source from github here.

Extending UDF

The first few lines of interest are very straightforward:

@Description(name = "moving_avg", value...
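To make the moving_avg idea concrete, here is a rough Python sketch of the computation such a function performs. This is my own illustration, not the post's Java code (which lives in the linked github repo), and the trailing-window semantics are an assumption on my part:

```python
# Hypothetical sketch of a moving average with a fixed window: each call to
# evaluate() plays the role of the UDF being invoked on one row, returning
# the average of the most recent `period` values seen so far.

from collections import deque

class MovingAverage:
    def __init__(self, period):
        self.window = deque(maxlen=period)   # deque drops the oldest value itself

    def evaluate(self, value):
        """Invoked once per row, like a Hive UDF's evaluate() method."""
        self.window.append(value)
        return sum(self.window) / len(self.window)

ma = MovingAverage(period=3)
print([ma.evaluate(v) for v in [1.0, 2.0, 3.0, 4.0]])
# [1.0, 1.5, 2.0, 3.0]
```

The straw-house limitation the post alludes to shows up here too: state like this window only makes sense if rows arrive in a well-defined order, which a simple row-wise UDF cannot guarantee on its own.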