Project author: MayankJasoria

Project description:

  A project for designing a system that executes specific kinds of INNER JOIN and GROUP BY SQL queries using Hadoop and Spark

Language: Java

Project homepage:

Repository: git://github.com/MayankJasoria/Hadoop-Spark-SQL.git

Created: 2019-09-04T16:44:24Z
Project community: https://github.com/MayankJasoria/Hadoop-Spark-SQL
License: MIT License

Hadoop-Spark-SQL

API Endpoints

The base URL is http://localhost:8080/cloudproject.
GET request on /api/test: returns a page displaying Hello World!. Useful for testing whether the API is live.
POST request on /api/query: the request body should be in application/json format.
- Input parameter: query (String) -> the SQL query to be run
- Output: an application/json response containing the required output parameters.
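
A minimal sketch of a client for the POST endpoint described above, using the java.net.http client available from Java 11. The class and method names (QueryClientSketch, toJsonBody, buildRequest) are hypothetical and not part of this project. The JSON key "query" is an assumption based on the input parameter name listed above, and the shape of the response is not documented here, so the sketch only prints it.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class QueryClientSketch {
    // Base URL as stated in this README.
    static final String BASE_URL = "http://localhost:8080/cloudproject";

    // Minimal JSON string escaping for the SQL text (quotes, backslashes, control characters).
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Assumption: the request body is a JSON object with a single "query" field.
    static String toJsonBody(String query) {
        return "{\"query\":\"" + escapeJson(query) + "\"}";
    }

    // Builds (but does not send) the POST request for /api/query.
    static HttpRequest buildRequest(String query) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/api/query"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJsonBody(query)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java QueryClientSketch \"<SQL query>\"");
            return;
        }
        // Sends the request to a running instance and prints the raw response.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(args[0]), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

The request is built separately from sending it so the body and headers can be inspected without a running Tomcat instance.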

Configuration

Global configuration file: com.project.cloud.Globals

This file contains HDFS output paths, which can be configured.

Project configuration: pom.xml

Maven project file for build settings and dependencies.

Requirements

The project, built as a webapp, has been tested on Tomcat 8.5.
The amount of memory allocated to the Tomcat JVM may need to be configured.

Assumptions

<Condition2> of the WHERE clause in an INNER JOIN query is assumed to be an equality operation on one of the columns of the final table.
<COLUMNS> of SELECT and GROUP BY are assumed to be the same.
The input value of any aggregate function, and the value against which it is compared in the HAVING clause, are assumed to be integers.
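
As a rough illustration of the assumptions above, the sketch below builds two query strings that would satisfy them. The table names (users, ratings), column names, the COUNT aggregate, the > comparison, and the class ExampleQueries are all hypothetical. The exact query grammar accepted by the project is not stated in this README, so these strings only show the shape the assumptions imply.

```java
import java.util.List;

class ExampleQueries {
    // Hypothetical INNER JOIN query: the WHERE condition is an equality
    // on a single column of the joined table.
    static final String INNER_JOIN_QUERY =
            "SELECT * FROM users INNER JOIN ratings"
            + " ON users.userid = ratings.userid"
            + " WHERE rating = 5";

    // Builds a GROUP BY query in which the same column list is used for both
    // SELECT and GROUP BY, and HAVING compares the aggregate to an integer.
    static String groupByQuery(List<String> columns, String aggregate,
                               String table, int threshold) {
        String cols = String.join(", ", columns);
        return "SELECT " + cols + ", " + aggregate
                + " FROM " + table
                + " GROUP BY " + cols
                + " HAVING " + aggregate + " > " + threshold;
    }

    public static void main(String[] args) {
        System.out.println(INNER_JOIN_QUERY);
        System.out.println(
                groupByQuery(List.of("age", "gender"), "COUNT(userid)", "users", 100));
    }
}
```

Deriving both column lists from one value makes it impossible for SELECT and GROUP BY to drift apart, and the int parameter keeps the HAVING comparison value an integer.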


