Data-TagDB
lib/Data/TagDB/Tutorial/Conventions.pod
While each subject has only one type, in a real-world database a tag might have several C<has-type> relations.
However, this is only allowed if those types are related by inheritance. As the data might not be complete, this
might be hard to check.
Therefore, it might be wise to allow multiple C<has-type> relations, as with C<also-has-role>, but to print a warning.
See also
L<Data::TagDB::Tutorial::WellKnown/also-has-role>,
L<Data::TagDB::Tutorial::WellKnown/has-type>.
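
The check described above could be sketched roughly as follows. This is only an illustration, not part of L<Data::TagDB>: it assumes the type hierarchy is available as a plain mapping from each type to the single type it specialises, and the type names, C<SPECIALISES>, and all helper functions are hypothetical.

```python
import warnings

# Hypothetical and possibly incomplete hierarchy: type -> type it specialises.
SPECIALISES = {
    "photo": "image",
    "image": "document",
}

def ancestors(type_name):
    """Return the types reached by following SPECIALISES upwards."""
    chain = []
    seen = {type_name}
    current = SPECIALISES.get(type_name)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = SPECIALISES.get(current)
    return chain

def related(a, b):
    """True if both types are equal or one is an ancestor of the other."""
    return a == b or a in ancestors(b) or b in ancestors(a)

def check_types(tag, types):
    """Accept all types, but warn about pairs not known to be related."""
    unrelated = [
        (a, b)
        for i, a in enumerate(types)
        for b in types[i + 1:]
        if not related(a, b)
    ]
    for a, b in unrelated:
        warnings.warn(f"{tag}: has-type {a!r} and {b!r} are not known to be related")
    return unrelated
```

Because the data might be incomplete, the sketch only warns and never rejects a tag: a missing relation does not prove that two types are unrelated.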
=head2 Type inherence
To implement type inheritance, the relations C<specialises> and C<generalises> are used.
It is very common to have multi-level type trees.
For newly defined types it is also wise to specialise them from common well-known types.
This allows software that is not aware of them to perform basic operations
(such as to correctly display them to the user).
See also
L<Data::TagDB::Tutorial::WellKnown/specialises>,
L<Data::TagDB::Tutorial::WellKnown/generalises>.
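
As an illustration of why specialising from well-known types helps, the following sketch shows software falling back to the nearest type it knows how to display. It assumes a simple mapping from each type to the type it specialises. The type names, C<SPECIALISES>, C<RENDERERS>, and both functions are hypothetical and not part of L<Data::TagDB>.

```python
# Hypothetical hierarchy: type -> type it specialises.
SPECIALISES = {
    "holiday-photo": "photo",
    "photo": "image",
}

# Types this (hypothetical) application knows how to display.
RENDERERS = {
    "image": lambda name: f"[image: {name}]",
}

def find_known_type(type_name):
    """Walk up the specialises chain until a type with a renderer is found."""
    seen = set()
    current = type_name
    while current is not None and current not in seen:
        if current in RENDERERS:
            return current
        seen.add(current)
        current = SPECIALISES.get(current)
    return None

def display(name, type_name):
    """Render a subject using the nearest known type, if there is one."""
    known = find_known_type(type_name)
    if known is None:
        return f"[unknown: {name}]"
    return RENDERERS[known](name)
```

Here the application has never heard of C<holiday-photo>, but can still treat such a subject as an C<image> because the new type was specialised from a type it knows.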
=head2 Common types
The following types are very common. Many other complex types specialise them.
See also
L<Data::TagDB::Tutorial::WellKnown>.
=head3 Types and roles
Types and roles inherit from C<subject-type>.
See also
L<Data::TagDB::Tutorial::WellKnown/subject-type>.
=head3 Entities, Persons, accounts
All accounts (like e-mail, bank, or user accounts) should have roles including
C<account> (C<b72508ba-7fb9-42ae-b4cf-b850b53a16c2>).
Accounts are owned by one or more entities.
They commonly include some amount of profile data
(such as names and contact information) and may represent the entity they are owned by.
In contrast an entity (natural or legal) should have roles including
C<entity> (C<09ade47e-b049-436b-bf10-8357f4b6bc05>).
Entities also include some amount of profile data. However this should be limited
to such data that is directly linked to the entity and is valid indefinitely
(e.g. names, cultural background, locations, and important events).
An entity never contains any technical data such as login credentials.
Such data is a clear sign that it is in fact an account.
Legal entities are entities that are created by legal means, most commonly
companies. In order for them to be legal entities, registration and legal
documentation are needed.
Such entities should include the role C<legal-entity> (C<f57f5e00-1d08-4731-b49b-c8316e23f06a>).
If it is unclear whether a group qualifies as a legal entity (e.g. people doing something together vs.
a registered association), it is wise to only mark it as C<entity>.
Natural entities are I<living beings>.
Those include alive, dead, real, fictional, human and non-human beings.
Such subjects should have the role
C<natural-entity> AKA C<person> (C<f6249973-59a9-47e2-8314-f7cf9a5f77bf>) included in their list of roles.
In addition to C<natural-entity>, the universal tag model includes the type
C<body> (C<5501e545-f39a-4d62-9f65-792af6b0ccba>), used to record everything related to the body of a person,
such as birthday, locations, and species,
and the type C<character> (C<a331f2c5-20e5-4aa2-b277-8e63fd03438d>), used to record anything about the character of a person,
such as identity, world view, and interests.
B<Note:>
When in doubt it is wise to use the role
C<proto-entity> (C<7be4d8c7-6a75-44cc-94f7-c87433307b26>).
It is the super-role for all other entity and account roles and provides many common properties.
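
The guidance in this section can be condensed into a small decision helper. This is a sketch only: the role identifiers are the ones listed above, but the function C<suggest_role> and its three flags are hypothetical, and a real subject may carry several roles rather than the single most specific one suggested here.

```python
# Role identifiers as listed in this section.
ROLES = {
    "account": "b72508ba-7fb9-42ae-b4cf-b850b53a16c2",
    "entity": "09ade47e-b049-436b-bf10-8357f4b6bc05",
    "legal-entity": "f57f5e00-1d08-4731-b49b-c8316e23f06a",
    "natural-entity": "f6249973-59a9-47e2-8314-f7cf9a5f77bf",
    "proto-entity": "7be4d8c7-6a75-44cc-94f7-c87433307b26",
}

def suggest_role(has_credentials=None, is_registered=None, is_living_being=None):
    """Suggest a role name. Each hypothetical flag may be None for 'unknown'."""
    if has_credentials:
        # Technical data such as login credentials is a clear sign of an account.
        return "account"
    if has_credentials is None:
        # In doubt between account and entity: use the common super-role.
        return "proto-entity"
    if is_living_being:
        return "natural-entity"
    if is_registered:
        return "legal-entity"
    # Unclear or unregistered: only mark it as an entity.
    return "entity"
```

For example, C<suggest_role(has_credentials=False, is_registered=True)> yields C<legal-entity>, while calling it without any information yields C<proto-entity>.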
=head3 Files
There are four different things people commonly refer to as files:
I<hardlinks>, I<inodes>, I<bit exact copies>, and I<creative works>.
We'll cover each in its own section. Here is a basic overview to find out
which section is the correct one:
=over
=item I<Hardlinks>
If you consider two files the same if they have the same filename
(that is, they are on the same machine and the B<full> path matches), even if the content,
timestamps, and file attributes change, you most likely mean I<hardlinks>.
=item I<Inodes>
If you consider two files the same if they refer to the same physical storage (the same blocks on the disk),
independent of the filename (and path), and permit the content, timestamps, and file attributes to
change, but consider each copy (that does not share blocks and therefore can change independently) distinct
(this also implies that any copy on a different disk is distinct), you likely mean I<inodes>.
=item I<Bit exact copies>
If you consider any copy that has exactly (bit-wise) the same content the same, independent of its storage,
then you most likely mean I<bit exact copies> (sometimes also called I<states>).
This also implies that if two files have the same hash or checksum they are likely the same, but if they have
different hashes or checksums they are always distinct.
=item I<Creative works>
If you consider all copies the same that have the same human readable content, independent of storage location,
file format, encoding, or any other property, you likely mean a I<creative work>.
B<Note:>
While creative works are one of the fundamental types they are strictly speaking not files by the definition of
a file. Therefore they are covered in their own section under L</Creative works>.
=back
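
The first three notions in the overview above correspond to different identity keys that can be computed for a path. The following sketch assumes a POSIX-like system. The three function names are hypothetical, and SHA-256 is only one possible choice of hash. Creative works are left out because their identity cannot be derived from the stored bytes.

```python
import hashlib
import os
import socket

def hardlink_key(path):
    """Identity by name: same machine and same full path."""
    return (socket.gethostname(), os.path.abspath(path))

def inode_key(path):
    """Identity by storage: same device and same inode number."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def content_key(path):
    """Identity by content: SHA-256 over the exact bytes of the file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two names for the same inode share an C<inode_key> but differ in C<hardlink_key>. A copy has its own C<inode_key> but the same C<content_key> as the original until one of them is modified. Note that C<inode_key> is only meaningful when compared on the same machine.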
=head4 Hardlinks
Hardlinks are very volatile as they often change over time. Therefore it is hard to include them in a non-volatile
data model. It is best to consider if the data one wants to store is better bound to another property than the filename