Looking to build a database, no idea where to begin

Hi /r/learnprogramming. Let me start by saying that I'm a mathematics grad with no previous programming experience (except MATLAB, but does that really count?), though I do have a good understanding of logic and algorithms.

I have an idea for a program that I want to build for personal use. I also want to learn Python, so I was thinking that I could use this as a project to work on while learning, but I'd appreciate advice on whether Python is unsuitable here, or whether another language would make things much easier.

I want to build a kind of database/filing system with dynamically updating 'tags'. Let me describe an example of typical usage I'm after:

I might come across an interesting news article about a deal between Company X and Company Y, who both work in Industry Z. I can download this article as a pdf, but then where do I file it: in the X, Y or Z folder? Instead, I want to manage a central list of 'tags' that indicate a link between these objects. I'm thinking that I can create a sort of 'category' for companies (and place X and Y on this list) as well as one for industries (and place Z here). Then I can give the pdf file a reference number which is tagged with X, Y and Z, and navigate back to the news article through any of the three relevant routes.
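One possible way to model this is a many-to-many link between documents and tags. Below is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and the helper functions `make_db`, `add_document` and `documents_for_tag` are all made up for illustration, and the file path is a placeholder.

```python
import sqlite3

def make_db(path=":memory:"):
    """Create the three tables (hypothetical schema) and return a connection."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY,        -- the reference number
            name TEXT NOT NULL,
            filepath TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS tags (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            category TEXT NOT NULL         -- e.g. 'company' or 'industry'
        );
        CREATE TABLE IF NOT EXISTS document_tags (
            document_id INTEGER NOT NULL REFERENCES documents(id),
            tag_id INTEGER NOT NULL REFERENCES tags(id),
            PRIMARY KEY (document_id, tag_id)
        );
    """)
    return conn

def add_document(conn, name, filepath, tags):
    """Store a document; `tags` is a list of (tag_name, category) pairs."""
    cur = conn.execute(
        "INSERT INTO documents (name, filepath) VALUES (?, ?)", (name, filepath)
    )
    doc_id = cur.lastrowid
    for tag_name, category in tags:
        # Create the tag only if it is not already on the central list.
        conn.execute(
            "INSERT OR IGNORE INTO tags (name, category) VALUES (?, ?)",
            (tag_name, category),
        )
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (tag_name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO document_tags VALUES (?, ?)", (doc_id, tag_id)
        )
    conn.commit()
    return doc_id

def documents_for_tag(conn, tag_name):
    """Return the names of all documents carrying the given tag."""
    rows = conn.execute(
        """
        SELECT d.name FROM documents d
        JOIN document_tags dt ON dt.document_id = d.id
        JOIN tags t ON t.id = dt.tag_id
        WHERE t.name = ?
        ORDER BY d.id
        """,
        (tag_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

With this layout the article is stored once, and looking it up by Company X, Company Y or Industry Z is the same query with a different tag name.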

I will then be able to do all kinds of interesting visualisations, like seeing which elements of the companies category are closely linked, or seeing a past series of articles attached to Company X, or many other things.
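As a rough illustration of the "closely linked" idea, one simple measure (an assumption here, not the only option) is how often two tags appear on the same article. The sketch below counts those co-occurrences; the function name `tag_cooccurrence` and the input shape (a dict mapping each article to its set of tags) are hypothetical.

```python
from collections import Counter
from itertools import combinations

def tag_cooccurrence(doc_tags):
    """Count how many documents each pair of tags shares.

    `doc_tags` maps a document name to the set of tags on it.
    Pairs are stored in sorted order, so ('X', 'Y') and ('Y', 'X')
    are counted as the same link.
    """
    counts = Counter()
    for tags in doc_tags.values():
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return counts
```

The resulting counts could be used as edge weights in a graph of companies, with a heavier edge meaning more shared articles.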

I also want to be able to add 'multi-layered tags'. For example, I might find a theoretical physics paper which I want to download. I could then tag this with the topics 'Physics' and 'Maths', but I might also want to tag it with 'Quantum Mechanics' inside the 'Physics' tag and 'Probability' inside the 'Maths' tag, so that in effect I would be managing an architecture that had

Database:{Physics: {Quantum Mechanics: {my article}}, Maths: {Probability: {my article (again)}}}.
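A nested structure like this can be derived rather than stored directly. The sketch below assumes each tag records only its parent (or `None` for a top-level tag) and each article records only the tags it was given; `build_tree` and the `"children"`/`"documents"` keys are names invented for the example.

```python
def build_tree(parent_of, doc_tags):
    """Build a nested view of tags and the documents filed under them.

    `parent_of` maps every tag to its parent tag (None for top-level tags).
    `doc_tags` maps a document name to the list of tags placed on it.
    """
    nodes = {tag: {"children": {}, "documents": []} for tag in parent_of}
    roots = {}
    for tag, parent in parent_of.items():
        if parent is None:
            roots[tag] = nodes[tag]
        else:
            nodes[parent]["children"][tag] = nodes[tag]
    for doc, tags in doc_tags.items():
        for tag in tags:
            nodes[tag]["documents"].append(doc)
    return roots
```

Because the same article name is appended under both 'Quantum Mechanics' and 'Probability', it shows up twice in the nested view while only being stored once.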

Any advice on getting started would be much appreciated, as I'm pretty out of my depth having never coded anything substantial before! Or if anyone could point me to an existing framework that I might be able to use, that would also be great. Cheers!

PS. Happy to answer any questions or clarify anything from the above.

My current idea is to have each 'object' (be it a Company, an Industry, 'Maths', a specific article…) be associated with a text file containing a name, a reference number, and a hierarchy of tags placed on numbered levels. Then each time I create a new object, the program would show me all the existing tags so I could pick the relevant ones, or I could add new ones, which would then be pushed back onto a master list of tags.
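The "master list" workflow could be sketched roughly as below. This keeps everything in memory rather than in text files, and the class name `Catalogue`, its fields and the `add` method are all hypothetical; the tag levels are left out to keep it short.

```python
from dataclasses import dataclass, field

@dataclass
class Catalogue:
    """Holds every object plus the master list of tags seen so far."""
    master_tags: set = field(default_factory=set)
    records: dict = field(default_factory=dict)
    next_ref: int = 1

    def add(self, name, tags, filepath=None):
        """Register a new object and return (reference number, newly created tags)."""
        new_tags = set(tags) - self.master_tags
        self.master_tags |= new_tags          # push unseen tags onto the master list
        ref = self.next_ref
        self.next_ref += 1
        self.records[ref] = {
            "name": name,
            "tags": sorted(tags),
            "filepath": filepath,
        }
        return ref, new_tags
```

Before calling `add`, the program could display `master_tags` so that existing tags are reused instead of retyped with slightly different spellings.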

For example, with the physics paper example, I could have:

Name: QM paper 1

Ref_num: 1

Level 1 tags: 'Physics', 'Maths'

Level 0 tags: 'Quantum Mechanics' (inside Level 1 'Physics'), 'Probability' (inside Level 1 'Maths')

filepath: C:…\qmpaper1.pdf

and then we would also have

Name: Physics

Ref_num: 2

Level 0 tags: 'Quantum Mechanics'

although now I suppose I have some redundant labelling (why bother labelling the paper with 'Physics' if 'Quantum Mechanics' is already inside 'Physics'?)…
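One way around the redundancy, assuming each tag stores its parent, is to label the paper only with its most specific tags and work out the broader ones on demand. The sketch below does that; `ancestors`, `effective_tags` and the `parent_of` mapping are illustrative names.

```python
def ancestors(tag, parent_of):
    """Return the chain of parent tags above `tag`, nearest first."""
    chain = []
    seen = {tag}
    parent = parent_of.get(tag)
    while parent is not None:
        if parent in seen:
            # Guard against a tag that is accidentally its own ancestor.
            raise ValueError(f"cycle in tag hierarchy at {parent!r}")
        chain.append(parent)
        seen.add(parent)
        parent = parent_of.get(parent)
    return chain

def effective_tags(direct_tags, parent_of):
    """Expand a document's direct tags to include every ancestor tag."""
    result = set(direct_tags)
    for tag in direct_tags:
        result.update(ancestors(tag, parent_of))
    return result
```

Here the paper would carry just 'Quantum Mechanics' and 'Probability', and a search for 'Physics' would still find it through the expanded set.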

