Amazon Launches 'Profoundly Disruptive' Data Warehouse
Amazon took a giant step into cloud-based data warehousing with the launch of RedShift on Wednesday, and the industry will feel its impact. "What I think we're seeing here is evidence of how, when a technology becomes a cloud-based service, it leads to the commoditization of any or all related services," said Charles King, principal analyst at Pund-IT.
Nov 29, 2012 7:00 AM PT
Amazon Web Services on Wednesday launched RedShift, an on-demand data warehouse service that is optimized for the analysis of huge sets of data.
RedShift is "profoundly disruptive," said Merv Adrian, research vice president of information management at Gartner. Its success will move the economic boundary between on-premises and cloud usage and "data will seek its lowest-cost home more rapidly than before."
RedShift allows setup, operation and scaling of a data warehouse cluster through the Web-based AWS Console.
Amazon is inviting prospective users to sign up for a limited preview of RedShift. More than 20 customers, including Netflix and NASA's Jet Propulsion Laboratory, have signed up for the preview, Amazon said.
Amazon RedShift offers a relational data warehouse platform that uses columnar storage and data compression to reduce the amount of input/output (I/O) required to perform queries. Because a query reads only the columns it references, and reads them in compressed form, less data moves from disk and queries run faster.
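To make the I/O argument concrete, here is a minimal Python sketch of the general idea. It is a toy illustration only, not RedShift's actual storage format or compression scheme: the table, the column names, the fixed 4-byte field width and the run-length encoding are all assumptions chosen for simplicity.

```python
import struct

# Hypothetical toy table: (order_id, region_code, amount_cents).
# Rows arrive sorted so that each region_code repeats 250 times in a row.
rows = [(i, i // 250, 100 + i) for i in range(1000)]

ROW_FORMAT = "<iii"                     # three 4-byte integers per row (assumed layout)
ROW_SIZE = struct.calcsize(ROW_FORMAT)  # 12 bytes
FIELD_SIZE = 4                          # bytes per integer field


def row_store_bytes_scanned(num_rows):
    """A row-oriented layout reads whole rows, even if the query needs one field."""
    return num_rows * ROW_SIZE


def column_store_bytes_scanned(num_rows, columns_needed=1):
    """A column-oriented layout reads only the columns the query references."""
    return num_rows * FIELD_SIZE * columns_needed


def run_length_encode(values):
    """Collapse consecutive repeated values into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


# Query scenario: aggregate over region_code only.
region_column = [region for _, region, _ in rows]
region_runs = run_length_encode(region_column)

row_bytes = row_store_bytes_scanned(len(rows))
column_bytes = column_store_bytes_scanned(len(rows))
# Each run is stored as two 4-byte integers: the value and its repeat count.
compressed_bytes = len(region_runs) * 2 * FIELD_SIZE

print(f"row store scan:         {row_bytes} bytes")
print(f"column store scan:      {column_bytes} bytes")
print(f"compressed column scan: {compressed_bytes} bytes")
```

In this contrived case the single-column scan touches a third of the bytes the row-oriented scan does, and the compressed column is smaller still because the values repeat in long runs. Real workloads will see different ratios depending on data types, sort order and the encoding used.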
The RedShift service runs on hardware that's optimized for data warehousing, with locally attached storage and 10 Gigabit Ethernet network connections between nodes.
This might sound similar to what Oracle offers with its Exadata database appliance, but "the hardware in question is not constrained to a purpose-built platform at all, unlike the Exadata Storage server," which can only be fully utilized by the Oracle 11gR2 platform, Adrian told TechNewsWorld.
RedShift has a massively parallel processing architecture, which lets users scale up or down as needed without downtime. For example, they can start with a single 2-TB node and scale up to 100 16-TB nodes, for a total of 1.6 petabytes.
Pricing is on demand, and starts at 85 US cents per hour for a 2-TB data warehouse.
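The capacity and pricing figures above can be checked with some quick arithmetic. The Python sketch below uses only the numbers quoted in this article; the function names are illustrative, and the yearly figure assumes a cluster that runs continuously for a 365-day year at the quoted on-demand rate, with decimal units (1 PB = 1,000 TB) and no other charges included.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: runs nonstop for a non-leap year


def cluster_capacity_tb(node_count, node_size_tb):
    """Total raw capacity of a cluster of identically sized nodes."""
    return node_count * node_size_tb


def annual_on_demand_cost_usd(hourly_rate_cents, hours=HOURS_PER_YEAR):
    """Yearly cost at a flat hourly rate, computed in cents to avoid float drift."""
    return hourly_rate_cents * hours / 100


# Largest configuration mentioned in the article: 100 nodes of 16 TB each.
max_capacity_tb = cluster_capacity_tb(100, 16)
max_capacity_pb = max_capacity_tb / 1000  # decimal petabytes

# Entry configuration: 85 US cents per hour for a 2-TB data warehouse.
entry_annual_cost_usd = annual_on_demand_cost_usd(85)
entry_cost_per_tb_year_usd = entry_annual_cost_usd / 2

print(f"maximum capacity: {max_capacity_tb} TB ({max_capacity_pb} PB)")
print(f"entry cost per year: ${entry_annual_cost_usd:,.2f}")
print(f"entry cost per TB per year: ${entry_cost_per_tb_year_usd:,.2f}")
```

Under those assumptions, the 2-TB entry configuration works out to $7,446 a year on demand, or $3,723 per terabyte per year. Actual bills would depend on how many hours the cluster runs and on any pricing terms not covered here.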
Business Intelligence With RedShift
"Now with Jaspersoft on RedShift, developers can embed HTML5 visualizations and analytics from Jaspersoft on any size data, powered by Amazon RDS relational database service or RedShift on the back end," Karl Van den Bergh, vice president of product and alliances at Jaspersoft, told TechNewsWorld.
Target Markets and Industry Impact
RedShift is for both enterprises and SMBs, Van den Bergh suggested. For enterprises, "it will be an opportunity to consolidate their costs in building and managing a data warehouse. For SMBs, it will be an opportunity to actually start doing data warehousing because of the low cost and the pay-as-you-go model."
The service is highly scalable and cost-effective, and it will affect data warehouse vendors, Jack Norris, vice president of marketing at MapR Technologies, told TechNewsWorld.
RedShift will complement MapR's service, which lets users of AWS's Elastic MapReduce (EMR) service manage the open source Hadoop clusters that EMR offers.
"What I think we're seeing here is evidence of how, when a technology becomes a cloud-based service, it leads to the commoditization of any or all related services," Charles King, principal analyst at Pund-IT, said.
"Amazon is essentially saying, 'Bring us your data, whatever the size, and we'll help you attain muscular query and analytics results at an affordable price. Oh, and you won't have to buy anything up front,'" King told TechNewsWorld. "From my perspective, that's not a bad spot for Amazon or its potential customers to be in."
However, that doesn't mean RedShift users will have an easy ride.
"Bear in mind that you don't buy a data warehouse, you build it," Gartner's Adrian pointed out. "RedShift is a platform. Design and deployment still take skills. Amazon's not selling those."
Amazon Web Services did not respond to our request for more details.