e2fyi.utils.aws.s3_resource
Provides S3Resource to represent resources in S3 buckets.
Module Contents
Classes
S3Resource: represents a resource currently in S3, or a local resource that will be uploaded to S3.
e2fyi.utils.aws.s3_resource.T
e2fyi.utils.aws.s3_resource.StringOrBytes
class e2fyi.utils.aws.s3_resource.S3Resource(filename: str, content_type: str = '', bucketname: str = '', prefix: str = '', protocol: str = 's3a://', stream: S3Stream[StringOrBytes] = None, s3client: boto3.client = None, stats: dict = None, **kwargs)
S3Resource represents a resource currently in S3, or a local resource that will be uploaded to S3. The S3Resource constructor automatically attempts to convert any input into a S3Stream; for more granular control, use S3Stream.from_any to create the S3Stream instead.
S3Resource is a readable stream - i.e. it has read, seek, and close.
Example:
    import boto3
    from e2fyi.utils.aws import S3Resource, S3Stream

    # create custom s3 client
    s3client = boto3.client(
        's3',
        aws_access_key_id=ACCESS_KEY,
        aws_secret_access_key=SECRET_KEY
    )

    # creates a local copy of s3 resource with S3Stream from a local file
    obj = S3Resource(
        # full path should be "prefix/some_file.json"
        filename="some_file.json",
        prefix="prefix/",
        # bucket to download from or upload to
        bucketname="some_bucket",
        # or "s3n://" or "s3://"
        protocol="s3a://",
        # uses default client if not provided
        s3client=s3client,
        # attempts to convert to S3Stream if input is not a S3Stream
        stream=S3Stream.from_file("./some_path/some_file.json"),
        # additional kwargs to pass to the `s3.upload_fileobj` or
        # `s3.download_fileobj` methods
        Metadata={"label": "foo"}
    )
    print(obj.key)  # prints "prefix/some_file.json"
    print(obj.uri)  # prints "s3a://some_bucket/prefix/some_file.json"

    # will attempt to fix prefix and filename if incorrect filename is provided
    obj = S3Resource(
        filename="subfolder/some_file.json",
        prefix="prefix/"
    )
    print(obj.filename)  # prints "some_file.json"
    print(obj.prefix)    # prints "prefix/subfolder/"
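The prefix and filename normalisation shown in the example above can be pictured with a small standalone sketch. The helpers `normalize_key` and `build_uri` are hypothetical names introduced here for illustration; they are not part of e2fyi, and the library's actual implementation may differ.

```python
import posixpath


def normalize_key(filename: str, prefix: str = "") -> tuple:
    """Hypothetical helper (not the library's code): returns (prefix, filename)
    with any folders found in `filename` moved into the prefix."""
    # Join prefix and filename, then split off the final path component.
    full_key = posixpath.join(prefix, filename)
    folder, name = posixpath.split(full_key)
    # Keep a trailing slash on a non-empty prefix, as in the example above.
    return (folder + "/" if folder else "", name)


def build_uri(protocol: str, bucketname: str, prefix: str, filename: str) -> str:
    """Hypothetical helper: protocol + bucket + "/" + key."""
    return f"{protocol}{bucketname}/{prefix}{filename}"


prefix, filename = normalize_key("subfolder/some_file.json", "prefix/")
print(prefix)    # prints "prefix/subfolder/"
print(filename)  # prints "some_file.json"
print(build_uri("s3a://", "some_bucket", "prefix/", "some_file.json"))
# prints "s3a://some_bucket/prefix/some_file.json"
```

The sketch assumes forward-slash keys and does not handle a filename that starts with "/".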
Saving to S3:
    from e2fyi.utils.aws import S3Resource

    # creates a local copy of s3 resource with some python object
    obj = S3Resource(
        filename="some_file.txt",
        prefix="prefix/",
        bucketname="some_bucket",
        stream={"some": "dict"},
    )

    # upload obj to s3 bucket "some_bucket" with the key "prefix/some_file.txt"
    # with the json string content.
    obj.save()

    # upload to s3 bucket "another_bucket" instead with a metadata tag.
    obj.save("another_bucket", Metadata={"label": "foo"})
Reading from S3:
    from e2fyi.utils.aws import S3Resource
    from pydantic import BaseModel

    # do not provide a stream input to the S3Resource constructor
    obj = S3Resource(
        filename="some_file.json",
        prefix="prefix/",
        bucketname="some_bucket",
        content_type="application/json"
    )

    # read the resource like a normal file object from S3
    data = obj.read()
    print(type(data))  # prints <class 'str'>

    # read and load json string into a dict or list
    # for content_type == "application/json" only
    data_obj = obj.load()
    print(type(data_obj))  # prints <class 'dict'> or <class 'list'>

    # read and convert into a pydantic model
    class Person(BaseModel):
        name: str
        age: int

    # automatically unpack the dict
    data_obj = obj.load(lambda name, age: Person(name=name, age=age))
    # alternatively, do not unpack
    data_obj = obj.load(lambda data: Person(**data), unpack=False)
    print(type(data_obj))  # prints <class 'Person'>
Creates a new instance of S3Resource, which will use boto3.s3.transfer.S3Transfer under the hood to download/upload the s3 resource.
Args:
    filename (str): filename of the object.
    content_type (str, optional): MIME type of the object. Defaults to "".
    bucketname (str, optional): name of the bucket the object is in or should be saved to. Defaults to "".
    prefix (str, optional): prefix to be added to the filename to get the S3 object key. Defaults to "".
    protocol (str, optional): S3 client protocol. Defaults to "s3a://".
    stream (S3Stream[StringOrBytes], optional): data stream. Defaults to None.
    s3client (boto3.client, optional): S3 client to use to retrieve the resource. Defaults to None.
    Metadata (dict, optional): metadata for the object. Defaults to None.
    **kwargs: any additional args to pass to boto3.s3.transfer.S3Transfer.
property content_type(self) → str
    MIME type of the resource.
property key(self) → str
    Key for the resource.
property uri(self) → str
    URI to the resource.
property stream(self) → S3Stream[StringOrBytes]
    Data stream for the resource.
read(self, size=-1) → StringOrBytes
    Duck-typing for a readable stream.
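As a rough illustration of what duck-typing a readable stream means, the sketch below defines a hypothetical `ReadableResource` class (not part of e2fyi) exposing only read, seek, and close, and passes it to `json.load`, which needs nothing more than a `read` method.

```python
import io
import json


class ReadableResource:
    """Hypothetical minimal readable object: read, seek, and close only."""

    def __init__(self, payload: bytes):
        self._buffer = io.BytesIO(payload)

    def read(self, size=-1):
        return self._buffer.read(size)

    def seek(self, offset, whence=0):
        return self._buffer.seek(offset, whence)

    def close(self):
        self._buffer.close()
        return self


resource = ReadableResource(b'{"some": "dict"}')
data = json.load(resource)  # json.load only calls resource.read()
print(data)                 # prints {'some': 'dict'}

resource.seek(0)            # rewind before reading again
print(resource.read(1))     # prints b'{'
```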
seek(self, offset: int, whence: int = 0) → int
    Duck-typing for a readable stream. See https://docs.python.org/3/library/io.html
    Change the stream position to the given byte offset. offset is interpreted relative to the position indicated by whence. The default value for whence is SEEK_SET. Values for whence are:
    - SEEK_SET or 0 – start of the stream (the default); offset should be zero or positive
    - SEEK_CUR or 1 – current stream position; offset may be negative
    - SEEK_END or 2 – end of the stream; offset is usually negative
    Return the new absolute position.
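The three whence values can be tried out on a standard-library `io.BytesIO`, which follows the same seek contract. This is only a sketch of the contract itself; it does not touch S3Resource.

```python
import io

stream = io.BytesIO(b"0123456789")

# SEEK_SET (0): offset counted from the start of the stream.
print(stream.seek(3))                # prints 3
print(stream.read(2))                # prints b'34' (position is now 5)

# SEEK_CUR (1): offset counted from the current position.
print(stream.seek(2, io.SEEK_CUR))   # prints 7
print(stream.read(1))                # prints b'7'

# SEEK_END (2): offset counted from the end; usually negative.
print(stream.seek(-2, io.SEEK_END))  # prints 8
print(stream.read())                 # prints b'89'
```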
close(self) → 'S3Resource'
    Close the resource stream.
get_value(self) → StringOrBytes
    Retrieve the entire contents of the S3Resource.
load(self, constructor: Callable[..., T] = None, unpack: bool = True) → Union[dict, list, T]
    Load the content of the stream into memory using json.loads. If a constructor is provided, it is used to create a new object from the loaded content. When unpack is True, the content is unpacked when it is passed to the constructor (i.e. * for a list, ** for a dict).
    Args:
        constructor (Callable[…, T], optional): a constructor function. Defaults to None.
        unpack (bool, optional): whether to unpack the content when passing it to the constructor. Defaults to True.
    Raises:
        TypeError: [description]
    Returns:
        Union[dict, list, T]: the parsed content (a dict or a list), or the object created by the constructor when one is provided.
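The unpack behaviour described for load can be sketched without S3 at all. The function `load_json` below is a hypothetical stand-in written for illustration, not the library's implementation; in particular, raising TypeError for content that is neither a dict nor a list is an assumption of this sketch, since the conditions for the real TypeError are not documented here.

```python
import json
from dataclasses import dataclass


def load_json(text, constructor=None, unpack=True):
    """Hypothetical stand-in for the behaviour described for `load`."""
    data = json.loads(text)
    if constructor is None:
        return data                  # plain dict or list
    if not unpack:
        return constructor(data)     # pass the parsed content as one argument
    if isinstance(data, dict):
        return constructor(**data)   # ** for dict
    if isinstance(data, list):
        return constructor(*data)    # * for list
    # Assumption of this sketch only: other JSON values cannot be unpacked.
    raise TypeError("content must be a dict or a list to be unpacked")


@dataclass
class Person:
    name: str
    age: int


print(load_json('{"name": "Ann", "age": 30}'))          # prints {'name': 'Ann', 'age': 30}
print(load_json('{"name": "Ann", "age": 30}', Person))  # prints Person(name='Ann', age=30)
print(load_json('["Ann", 30]', Person))                 # prints Person(name='Ann', age=30)
print(load_json('{"name": "Ann", "age": 30}',
                lambda d: Person(**d), unpack=False))   # prints Person(name='Ann', age=30)
```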
save(self, bucketname: str = None, s3client: boto3.client = None, **kwargs) → 'S3Resource'
    Saves the current S3Resource to the provided S3 bucket (given in the constructor or as an argument). Extra args can be passed to boto3.s3.transfer.S3Transfer via keyword arguments of the same name.
    Args:
        bucketname (str, optional): bucket to save the resource to. Overrides the bucket name provided in the constructor. Defaults to None.
        s3client (boto3.client, optional): custom S3 client to use. Defaults to None.
        **kwargs: additional args to pass to boto3.s3.transfer.S3Transfer.
    Raises:
        ValueError: "S3 bucket name must be provided."
    Returns:
        S3Resource: S3Resource object.
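The bucket resolution and keyword forwarding described for save might look roughly like the sketch below. Everything here is hypothetical: `RecordingClient` is a stand-in that mimics only the `upload_fileobj(Fileobj, Bucket, Key, ExtraArgs=...)` call shape of a boto3 S3 client so the code runs without AWS, and the `save` function is an illustration of the documented behaviour, not the library's code. Forwarding the keyword arguments as `ExtraArgs` is an assumption.

```python
import io


class RecordingClient:
    """Stand-in for a boto3 S3 client; records uploads instead of sending them."""

    def __init__(self):
        self.calls = []

    def upload_fileobj(self, Fileobj, Bucket, Key, ExtraArgs=None):
        self.calls.append(
            {"Bucket": Bucket, "Key": Key, "ExtraArgs": ExtraArgs, "Body": Fileobj.read()}
        )


def save(stream, key, s3client, default_bucket="", bucketname=None, **kwargs):
    """Hypothetical sketch of the save flow described above."""
    # The bucket passed to save() takes priority over the constructor's bucket.
    bucket = bucketname or default_bucket
    if not bucket:
        raise ValueError("S3 bucket name must be provided.")
    # Assumption: extra keyword arguments are forwarded as ExtraArgs.
    s3client.upload_fileobj(stream, bucket, key, ExtraArgs=kwargs or None)
    return bucket


client = RecordingClient()
save(io.BytesIO(b'{"some": "dict"}'), "prefix/some_file.txt", client,
     default_bucket="some_bucket")
save(io.BytesIO(b'{"some": "dict"}'), "prefix/some_file.txt", client,
     default_bucket="some_bucket", bucketname="another_bucket",
     Metadata={"label": "foo"})

print(client.calls[0]["Bucket"])     # prints some_bucket
print(client.calls[1]["Bucket"])     # prints another_bucket
print(client.calls[1]["ExtraArgs"])  # prints {'Metadata': {'label': 'foo'}}
```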