data:image/s3,"s3://crabby-images/a9d38/a9d38112068fc3d3a58bbee6ab16cc7b45bf8983" alt=""
Wednesday, December 18, 2019
Compare two SQL queries with pandas dataframe comparison
Compare the the two queries return the same data:
Note that you do not need step 2&3 if the columns have been ordered in SQL query statement, e.g.:
- query 1 from QA
- query 2 from PROD
Both queries use the same query statement:
SELECT * FROM MY_HANA_VIEW WHERE MY_CONDITION
1. Get data from DB
df_qa = pd.read_sql_query(QUERY_DIM_LEI_RATING, engine_qa)
df_prod = pd.read_sql_query(QUERY_DIM_LEI_RATING, engine_prod)
2. Sort by columns in place
columns_list = df_qa.columns.values.tolist()
df_qa.sort_values(by=columns,inplace=True)
df_prod.sort_values(by=columns, inplace = True)
3. Reset index in place
Drop the existing index and replace with the reset one
df_qa.reset_index(drop=True, inplace=True)
df_prod.reset_index(drop=True, inplace=True)
4. Assert frame equal
assert_frame_equal(df_qa, df_prod)
Note that you do not need step 2&3 if the columns have been ordered in SQL query statement, e.g.:
SELECT c1, c2, c3 FROM MY_HANA_VIEW WHERE MY_CONDITION order by c1, c2, c3
Wednesday, December 4, 2019
Load Thomson Reuters LEI to HANA with SAP DataService + Python
This blog is to consume TR REST API using SAP DS and Python to load TR LEI information to HANA database:
1. Create a DataFlow with 3 objects
SQL: query HANA view to get the LEI identifiers which will be put into the payload of REST API
User Defined Base Transform: this is where the Python code accessing REST API and processing coming response
Table: the database table to save the data
data:image/s3,"s3://crabby-images/cbd69/cbd690df8057fe27d0699b3523aee9e56710bd82" alt=""
2. Qquery HANA view to get the LEI identifiers
data:image/s3,"s3://crabby-images/09c46/09c46a0777a1b0763062b386a17ad7375cdcdedc" alt=""
3. Set the input for Python processing
data:image/s3,"s3://crabby-images/59e01/59e0183a55c6ca5ae69d0d88dfcd4c0199f5d98f" alt=""
4. Bring up the "User Defined Editor", here it's Python
data:image/s3,"s3://crabby-images/3985e/3985e97b8580ab4d14cb2e908441269abafbcd26" alt=""
5. Set up the output of Python processing
data:image/s3,"s3://crabby-images/ec945/ec9457c535c53c631ff79e7e20f7db49ab5576b5" alt=""
We'll use 'Per Collection' mode
data:image/s3,"s3://crabby-images/78546/7854613e785aabe8b88d09af65334b626689e534" alt=""
Save the final data to Collection(the data records collection, this is the output data)
data:image/s3,"s3://crabby-images/d10a4/d10a4a1f9fef9cbe70793ebb897373bbc9752902" alt=""
With a solution using Python in DS, it is flexible and powerful for data loading and processing.
The only thing I dislike the is the integrated Python editor in SAP DS.
Note that SAP DS 4.2 support Python 2.7 only. Also the default library accessing REST API is urllib/urllib2. I'd like to install 'pip' and then install 'requests' for REST API consumption.
1. Create a DataFlow with 3 objects
SQL: query HANA view to get the LEI identifiers which will be put into the payload of REST API
User Defined Base Transform: this is where the Python code accessing REST API and processing coming response
Table: the database table to save the data
data:image/s3,"s3://crabby-images/cbd69/cbd690df8057fe27d0699b3523aee9e56710bd82" alt=""
2. Qquery HANA view to get the LEI identifiers
data:image/s3,"s3://crabby-images/09c46/09c46a0777a1b0763062b386a17ad7375cdcdedc" alt=""
3. Set the input for Python processing
data:image/s3,"s3://crabby-images/59e01/59e0183a55c6ca5ae69d0d88dfcd4c0199f5d98f" alt=""
4. Bring up the "User Defined Editor", here it's Python
data:image/s3,"s3://crabby-images/3985e/3985e97b8580ab4d14cb2e908441269abafbcd26" alt=""
5. Set up the output of Python processing
data:image/s3,"s3://crabby-images/ec945/ec9457c535c53c631ff79e7e20f7db49ab5576b5" alt=""
We'll use 'Per Collection' mode
data:image/s3,"s3://crabby-images/78546/7854613e785aabe8b88d09af65334b626689e534" alt=""
Save the final data to Collection(the data records collection, this is the output data)
data:image/s3,"s3://crabby-images/d10a4/d10a4a1f9fef9cbe70793ebb897373bbc9752902" alt=""
With a solution using Python in DS, it is flexible and powerful for data loading and processing.
The only thing I dislike the is the integrated Python editor in SAP DS.
Note that SAP DS 4.2 support Python 2.7 only. Also the default library accessing REST API is urllib/urllib2. I'd like to install 'pip' and then install 'requests' for REST API consumption.
Subscribe to:
Posts (Atom)