本产品推出了新版本。
简体中文版经机器翻译而成,仅供参考。如与英语版出现任何冲突,应以英语版为准。
SelectObjectContent
贡献者
建议更改
您可以使用 S3 SelectObjectContent 请求根据简单的 SQL 语句筛选 S3 对象的内容。
有关详细信息,请参见 "适用于 SelectObjectContent 的 AWS 文档"。
您需要的内容
-
此租户帐户具有 S3 Select 权限。
-
您对要查询的对象具有
s 3 : GetObject
权限。 -
要查询的对象为 CSV 格式,或者为包含 CSV 格式文件的 GZIP 或 bzip 2 压缩文件。
-
SQL 表达式的最大长度为 256 KB 。
-
输入或结果中的任何记录的最大长度为 1 MiB 。
请求语法示例
POST /{Key+}?select&select-type=2 HTTP/1.1
Host: Bucket.s3.abc-company.com
x-amz-expected-bucket-owner: ExpectedBucketOwner
<?xml version="1.0" encoding="UTF-8"?>
<SelectObjectContentRequest xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Expression>string</Expression>
<ExpressionType>string</ExpressionType>
<RequestProgress>
<Enabled>boolean</Enabled>
</RequestProgress>
<InputSerialization>
<CompressionType>GZIP</CompressionType>
<CSV>
<AllowQuotedRecordDelimiter>boolean</AllowQuotedRecordDelimiter>
<Comments>#</Comments>
<FieldDelimiter>\t</FieldDelimiter>
<FileHeaderInfo>USE</FileHeaderInfo>
<QuoteCharacter>'</QuoteCharacter>
<QuoteEscapeCharacter>\\</QuoteEscapeCharacter>
<RecordDelimiter>\n</RecordDelimiter>
</CSV>
</InputSerialization>
<OutputSerialization>
<CSV>
<FieldDelimiter>string</FieldDelimiter>
<QuoteCharacter>string</QuoteCharacter>
<QuoteEscapeCharacter>string</QuoteEscapeCharacter>
<QuoteFields>string</QuoteFields>
<RecordDelimiter>string</RecordDelimiter>
</CSV>
</OutputSerialization>
<ScanRange>
<End>long</End>
<Start>long</Start>
</ScanRange>
</SelectObjectContentRequest>
SQL 查询示例
此查询可从美国人口统计数据中获取状态名称, 2010 年人口, 2015 年估计人口以及变更百分比。文件中不属于状态的记录将被忽略。
SELECT STNAME, CENSUS2010POP, POPESTIMATE2015, CAST((POPESTIMATE2015 - CENSUS2010POP) AS DECIMAL) / CENSUS2010POP * 100.0 FROM S3Object WHERE NAME = STNAME
要查询的文件的前几行 SUB-EST2020_all.csv
如下所示:
SUMLEV,STATE,COUNTY,PLACE,COUSUB,CONCIT,PRIMGEO_FLAG,FUNCSTAT,NAME,STNAME,CENSUS2010POP, ESTIMATESBASE2010,POPESTIMATE2010,POPESTIMATE2011,POPESTIMATE2012,POPESTIMATE2013,POPESTIMATE2014, POPESTIMATE2015,POPESTIMATE2016,POPESTIMATE2017,POPESTIMATE2018,POPESTIMATE2019,POPESTIMATE042020, POPESTIMATE2020 040,01,000,00000,00000,00000,0,A,Alabama,Alabama,4779736,4780118,4785514,4799642,4816632,4831586, 4843737,4854803,4866824,4877989,4891628,4907965,4920706,4921532 162,01,000,00124,00000,00000,0,A,Abbeville city,Alabama,2688,2705,2699,2694,2645,2629,2610,2602, 2587,2578,2565,2555,2555,2553 162,01,000,00460,00000,00000,0,A,Adamsville city,Alabama,4522,4487,4481,4474,4453,4430,4399,4371, 4335,4304,4285,4254,4224,4211 162,01,000,00484,00000,00000,0,A,Addison town,Alabama,758,754,751,750,745,744,742,734,734,728, 725,723,719,717
AWS-CLI 使用示例
aws s3api select-object-content --endpoint-url https://10.224.7.44:10443 --no-verify-ssl --bucket 619c0755-9e38-42e0-a614-05064f74126d --key SUB-EST2020_ALL.csv --expression-type SQL --input-serialization '{"CSV": {"FileHeaderInfo": "USE", "Comments": "#", "QuoteEscapeCharacter": "\"", "RecordDelimiter": "\n", "FieldDelimiter": ",", "QuoteCharacter": "\"", "AllowQuotedRecordDelimiter": false}, "CompressionType": "NONE"}' --output-serialization '{"CSV": {"QuoteFields": "ASNEEDED", "QuoteEscapeCharacter": "#", "RecordDelimiter": "\n", "FieldDelimiter": ",", "QuoteCharacter": "\""}}' --expression "SELECT STNAME, CENSUS2010POP, POPESTIMATE2015, CAST((POPESTIMATE2015 - CENSUS2010POP) AS DECIMAL) / CENSUS2010POP * 100.0 FROM S3Object WHERE NAME = STNAME" changes.csv
输出文件的前几行 changes.csv
如下所示:
Alabama,4779736,4854803,1.5705260708959658022953568983726297854 Alaska,710231,738430,3.9703983633493891424057806544631253775 Arizona,6392017,6832810,6.8959922978928247531256565807005832431 Arkansas,2915918,2979732,2.1884703204959810255295244928012378949 California,37253956,38904296,4.4299724839960620557988526104449148971 Colorado,5029196,5454328,8.4532796097030221132761578590295546246