[This article was first published on Yet Another Blog in Statistical Computing » S+/R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Similar to SQLDF package providing a seamless interface between SQL statement and R data.frame, PANDASQL allows python users to use SQL querying Pandas DataFrames.
Below are some examples showing how to use PANDASQL to do SELECT / AGGREGATE / JOIN operations. More information is also available on the GitHub (https://github.com/yhat/pandasql).
In [1]: import sas7bdat as sas
In [2]: import pandas as pd
In [3]: import pandasql as pdsql
In [4]: data = sas.SAS7BDAT("accepts.sas7bdat")
In [5]: df = data.toDataFrame()
In [6]: pysql = lambda q: pdsql.sqldf(q, globals())
In [7]: ### SELECT ###
In [8]: str1 = "select bureau_score, ltv from df where bureau_score < 600 and ltv > 100 limit 3;"
In [9]: df1 = pysql(str1)
In [10]: df1
Out[10]:
bureau_score ltv
0 590 103
1 575 120
2 538 113
In [11]: ### AGGREGATE ###
In [12]: str2 = "select ltv, min(bureau_score) as min_score, max(bureau_score) as max_score from df group by ltv order by ltv DESC;"
In [13]: df2 = pysql(str2);
In [14]: df2.head(3)
Out[14]:
ltv min_score max_score
0 176 709 709
1 168 723 723
2 167 688 688
In [15]: ### JOIN ###
In [16]: str3 = "select b.*, a.bureau_score from df a inner join df2 b on a.ltv = b.ltv order by ltv DESC;"
In [17]: df3 = pysql(str3)
In [18]: df3.head(3)
Out[18]:
ltv min_score max_score bureau_score
0 176 709 709 709
1 168 723 723 723
2 167 688 688 688
To leave a comment for the author, please follow the link and comment on their blog: Yet Another Blog in Statistical Computing » S+/R.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
