I am trying to capture (S3) logs in a structured way. I am capturing the access-related elements with this type of tuple:
class _Access(NamedTuple):
time: datetime
ip: str
actor: str
request_id: str
action: str
key: str
request_uri: str
status: int
error_code: str
I then have a class that uses this named tuple as follows (edited just down to relevant code):
class Logs:
def __init__(self, log: str):
raw_logs = match(S3_LOG_REGEX, log)
if raw_logs is None:
raise FormatError(log)
logs = raw_logs.groups()
timestamp = datetime.strptime(logs[2], "%d/%b/%Y:%H:%M:%S %z")
http_status = int(logs[9])
access = _Access(
timestamp,
logs[3],
logs[4],
logs[5],
logs[6],
logs[7],
logs[8],
http_status,
logs[10],
)
self.access = access
The problem is that it is too verbose when I now want to use it:
>>> log_struct = Logs(raw_log)
>>> log_struct.access.action # I don't want to have to add `access`
As I mention above, I'd rather be able to do something like this:
>>> log_struct = Logs(raw_log)
>>> log_struct.action
But I still want to have this clean named tuple called _Access
. How can I make everything from access
available at the top level?
Specifically, I have this line:
self.access = access
which is giving me that extra "layer" that I don't want. I'd like to be able to "unpack" it somehow, similar to how we can unpack arguments by passing the star in *args
. But I'm not sure how I can unpack the tuple in this case.
CodePudding user response:
What you really need for your use case is an alternative constructor for your NamedTuple
subclass to parse a string of a log entry into respective fields, which can be done by creating a class method that calls the __new__
method with arguments parsed from the input string.
Using just the fields of ip
and action
as a simplified example:
from typing import NamedTuple
class Logs(NamedTuple):
ip: str
action: str
@classmethod
def parse(cls, log: str) -> 'Logs':
return cls.__new__(cls, *log.split())
log_struct = Logs.parse('192.168.1.1 GET')
print(log_struct)
print(log_struct.ip)
print(log_struct.action)
This outputs:
Logs(ip='192.168.1.1', action='GET')
192.168.1.1
GET
CodePudding user response:
I agree with @blhsing and recommend that solution. This is assuming that there are not extra attributes required to be apply to the named tuple (say storing the raw log value).
If you really need the object to remain composed, another way to support accessing the properties of the _Access
class would be to override the __getattr__
method [PEP 562] of Logs
The
__getattr__
function at the module level should accept one argument which is the name of an attribute and return the computed value or raise an AttributeError:def __getattr__(name: str) -> Any: ...
If an attribute is not found on a module object through the normal lookup (i.e.
object.__getattribute__
), then__getattr__
is searched in the module__dict__
before raising anAttributeError
. If found, it is called with the attribute name and the result is returned. Looking up a name as a module global will bypass module__getattr__
. This is intentional, otherwise calling__getattr__
for builtins will significantly harm performance.
E.g.
from typing import NamedTuple, Any
class _Access(NamedTuple):
foo: str
bar: str
class Logs:
def __init__(self, log: str) -> None:
self.log = log
self.access = _Access(*log.split())
def __getattr__(self, name: str) -> Any:
return getattr(self.access, name)
When you request an attribute of Logs
which is not present it will try to access the attribute through the Logs.access
attribute. Meaning you can write code like this:
logs = Logs("fizz buzz")
print(f"{logs.log=}, {logs.foo=}, {logs.bar=}")
logs.log='fizz buzz', logs.foo='fizz', logs.bar='buzz'
Note that this would not preserve the typing information through to the Logs
object in most static analyzers and autocompletes. That to me would be a compelling enough reason not to do this, and continue to use the more verbose way of accessing values as you describe in your question.
If you still really need this, and want to remain type safe. Then I would add properties to the Logs
class which fetch from the _Access
object.
class Logs:
def __init__(self, log: str) -> None:
self.log = log
self.access = _Access(*log.split())
@property
def foo(self) -> str:
return self.access.foo
@property
def bar(self) -> str:
return self.access.bar
This avoids the type safty issues, and depending on how much code you write using the Logs
instances, still can cut down on other boilerplate dramatically.