
Dataclasses

By Atif Alam

Dataclasses are a way to define classes that mainly hold data, with less boilerplate. You declare attributes with type hints; Python generates __init__, __repr__, and (by default) __eq__ for you. They live in the standard library: from dataclasses import dataclass. Requires Python 3.7+.

Fields can use any types you would annotate elsewhere—str, int, float, bool, datetime.date, optional types, nested dataclasses, and so on:

from dataclasses import dataclass
from datetime import date

@dataclass
class ServiceRecord:
    vehicle_id: str
    service_type: str
    serviced_on: date
    odometer_km: int
    cost_usd: float
    warranty_active: bool

rec = ServiceRecord(
    vehicle_id="VH-204",
    service_type="Oil change",
    serviced_on=date(2026, 2, 15),
    odometer_km=45200,
    cost_usd=89.99,
    warranty_active=True,
)
print(rec)
# ServiceRecord(vehicle_id='VH-204', service_type='Oil change', serviced_on=datetime.date(2026, 2, 15), odometer_km=45200, cost_usd=89.99, warranty_active=True)
  • The @dataclass decorator tells Python to generate __init__, __repr__, and __eq__ from the attribute list.
  • You can construct instances with positional or keyword arguments (order follows the field order in the class).
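
To make the generated methods concrete, here is a minimal sketch using a hypothetical `Point` class (not part of the original example). It shows positional and keyword construction, plus the field-by-field comparison that the generated `__eq__` performs.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical class, for illustration only
    x: int
    y: int


a = Point(1, 2)       # positional: follows field order (x, then y)
b = Point(y=2, x=1)   # keyword: argument order does not matter

print(a)        # Point(x=1, y=2)
print(a == b)   # True: the generated __eq__ compares fields
print(a is b)   # False: they are still two distinct objects
```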

Dataclass instances are mutable by default, so you often read and update attributes in place. When you want a new instance with some fields changed (without mutating the original), use dataclasses.replace. Delete usually means removing an object from a collection or dropping a reference—not “delete a column” like in a database.

from dataclasses import dataclass, replace
from datetime import date

@dataclass
class ServiceRecord:
    vehicle_id: str
    service_type: str
    serviced_on: date
    odometer_km: int
    cost_usd: float
    warranty_active: bool

# CREATE: construct a new instance
rec = ServiceRecord("VH-204", "Oil change", date(2026, 2, 15), 45200, 89.99, True)

# READ: attribute access
print(rec.vehicle_id, rec.serviced_on, rec.warranty_active)

# UPDATE (in place): mutate attributes
rec.odometer_km = 46000
rec.warranty_active = False

# UPDATE (new instance): replace() returns a copy with the given fields changed; `rec` itself is not modified
rec_v2 = replace(rec, serviced_on=date(2026, 3, 1), cost_usd=120.50)

# DELETE: typically remove from a list or stop referencing the object
history = [rec, rec_v2]
history = [r for r in history if r.vehicle_id != "VH-204"]  # drop matching records (here both match, so the list ends up empty)
# Or: del history[0]  # remove by index

Summary

| Operation | Typical approach |
|-----------|------------------|
| Create | Call the class like a constructor: ServiceRecord(...) |
| Read | Use dot access: rec.odometer_km |
| Update | Assign: rec.odometer_km = …, or replace(rec, field=value) for a copy with changes |
| Delete | Remove from a list/dict, or del the variable; use @dataclass(frozen=True) if you want to forbid attribute assignment |
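
The `frozen=True` option mentioned in the table can be sketched as follows. `FrozenRecord` is a hypothetical, trimmed-down class used only for illustration; the point is that assignment raises `dataclasses.FrozenInstanceError`, while `replace` still works because it builds a new instance.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class FrozenRecord:  # hypothetical class, for illustration only
    vehicle_id: str
    odometer_km: int


rec = FrozenRecord("VH-204", 45200)

try:
    rec.odometer_km = 46000  # blocked: frozen instances reject assignment
except FrozenInstanceError:
    print("assignment blocked")

# replace() does not mutate rec; it constructs a new FrozenRecord
updated = replace(rec, odometer_km=46000)
print(rec.odometer_km, updated.odometer_km)  # 45200 46000
```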

For optional fields, use Optional[...] or T | None (3.10+) and defaults, e.g. notes: str | None = None.
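
A small sketch of optional fields with defaults, using a hypothetical `ServiceNote` class. It uses `Optional[str]` so that it also runs on versions before 3.10; on 3.10+ the annotation could be written `str | None` instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceNote:  # hypothetical class, for illustration only
    vehicle_id: str
    notes: Optional[str] = None  # same meaning as `str | None` on 3.10+
    priority: int = 0            # fields with defaults must follow fields without


plain = ServiceNote("VH-204")
detailed = ServiceNote("VH-204", notes="Check brake pads", priority=2)

print(plain)     # ServiceNote(vehicle_id='VH-204', notes=None, priority=0)
print(detailed)  # ServiceNote(vehicle_id='VH-204', notes='Check brake pads', priority=2)
```

Callers can then omit the optional fields entirely and test for `None` where the value matters.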

With a normal class you write __init__, __repr__, and __eq__ by hand. A dataclass generates all of these for you:

Manual class — you implement the constructor (how to create an instance), string representation (how it prints), and equality (when two instances are considered equal) yourself:

from datetime import date

class ServiceRecord:
    def __init__(
        self,
        vehicle_id: str,
        service_type: str,
        serviced_on: date,
        odometer_km: int,
        cost_usd: float,
        warranty_active: bool,
    ):
        self.vehicle_id = vehicle_id
        self.service_type = service_type
        self.serviced_on = serviced_on
        self.odometer_km = odometer_km
        self.cost_usd = cost_usd
        self.warranty_active = warranty_active

    def __repr__(self):
        return (
            f"ServiceRecord(vehicle_id={self.vehicle_id!r}, service_type={self.service_type!r}, "
            f"serviced_on={self.serviced_on!r}, odometer_km={self.odometer_km}, "
            f"cost_usd={self.cost_usd}, warranty_active={self.warranty_active})"
        )

    def __eq__(self, other):
        if not isinstance(other, ServiceRecord):
            return NotImplemented
        return (
            self.vehicle_id == other.vehicle_id
            and self.service_type == other.service_type
            and self.serviced_on == other.serviced_on
            and self.odometer_km == other.odometer_km
            and self.cost_usd == other.cost_usd
            and self.warranty_active == other.warranty_active
        )

Dataclass — same behavior from a short attribute list:

from dataclasses import dataclass
from datetime import date

@dataclass
class ServiceRecord:
    vehicle_id: str
    service_type: str
    serviced_on: date
    odometer_km: int
    cost_usd: float
    warranty_active: bool

Same idea; less code and clearer types with a dataclass.

  • Dataclasses are mutable by default, support default values and type hints naturally, can have methods, and are a normal class (so IDEs and type checkers understand them).
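  • namedtuple is immutable and has a very light syntax. It supports defaults only through the defaults argument (or the typing.NamedTuple class syntax), and its instances behave like tuples: they can be indexed and unpacked, and they compare equal to plain tuples with the same values.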
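
The difference in mutability and equality can be seen in a short sketch. `PointNT` and `PointDC` are hypothetical names introduced only for this comparison.

```python
from collections import namedtuple
from dataclasses import dataclass

PointNT = namedtuple("PointNT", ["x", "y"])  # hypothetical, for illustration


@dataclass
class PointDC:  # hypothetical, for illustration
    x: int
    y: int


nt = PointNT(1, 2)
dc = PointDC(1, 2)

dc.x = 10  # dataclass: mutable by default

try:
    nt.x = 10  # namedtuple: attributes are read-only
except AttributeError:
    nt_is_mutable = False
else:
    nt_is_mutable = True

print(nt_is_mutable)   # False
print(nt == (1, 2))    # True: a namedtuple is a tuple
print(dc == (10, 2))   # False: a dataclass only equals instances of its own class
```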

Use dataclasses when you want a small data container with optional defaults and methods.

Good for: config objects, parsing results, small data transfer objects (DTOs) — any place you’d otherwise write a class that’s mostly “data + maybe a method or two.” For a longer example, see Process cloud policies, which uses dataclasses for policy statements and documents.
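
As a smaller illustration of the "data plus a method or two" pattern, here is a sketch of a config object. `RetryConfig`, its fields, and the backoff formula are all hypothetical and chosen only to show a dataclass carrying defaults and one helper method.

```python
from dataclasses import dataclass


@dataclass
class RetryConfig:  # hypothetical config object, for illustration only
    max_attempts: int = 3
    base_delay_s: float = 0.5

    def delay_for(self, attempt: int) -> float:
        """Return an exponential backoff delay for a 1-based attempt number."""
        return self.base_delay_s * 2 ** (attempt - 1)


cfg = RetryConfig(max_attempts=5)  # override one default, keep the other
print(cfg)                                      # RetryConfig(max_attempts=5, base_delay_s=0.5)
print([cfg.delay_for(n) for n in range(1, 4)])  # [0.5, 1.0, 2.0]
```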