Wondering how to add TONS of data


#1

hello! the app i’m building is a collection of minecraft recipes. i’ve currently got all the data sitting in a JSON file (objects nested within objects). is there a simple way to get that into the SQLite database? googling hasn’t helped - all the answers on stackoverflow are way over my head.

also, i’d like to have another “collection” within the recipes collection. the nested “collection” is for minecraft items (this data is also sitting in a JSON file). some items are strictly items, and some are ingredients for the recipes. can i do this and have the items connected to the recipes? this seems way over my head too…

thanks for any help! let me know if my questions aren’t clear.


#2

So it looks like Django has this built-in. I just researched this since I’m interested in the same topic. It would be great to load some test data to the local server after purging it. It would also be great to be able to load some data after pushing to production.

First you might want to take a look at this article from the Django website:
https://docs.djangoproject.com/en/1.8/howto/initial-data/

Since your data is already in JSON, it should be pretty easy to see how to fit it into your model.

Here is what I did:

Created a new directory “fixtures” under “collection”
Created a new file under fixtures: load_initial_data.json

[
    {
        "model": "collection.thing",
        "pk": 1,
        "fields": {
            "name": "Thing 1",
            "description": "The description of thing 1",
            "slug": "thing-1"
        }
    },
    {
        "model": "collection.thing",
        "pk": 2,
        "fields": {
            "name": "Thing 2",
            "description": "The description of thing 2",
            "slug": "thing-2"
        }
    }
]

Note, before you proceed: the “pk” attribute is the ID of the row in your database, so loading this file will overwrite any existing rows that have the same IDs. Make sure you aren’t overwriting anything important!
Run: python manage.py loaddata load_initial_data.json
You should see the following message:
Installed 2 object(s) from 1 fixture(s)

Notice that the model is “collection.thing”: “collection” is the app label in this case, and “thing” is the model name.

You will need to make your JSON data compatible with the loaddata format, which will likely mean modifying the file. I’d recommend writing a separate Python script to do this programmatically outside of your app. If you post an example of what your JSON looks like, we can be more specific.
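
For what it’s worth, here is a rough sketch of what such a script could look like. It assumes the source file is a flat list of objects with “name” and “description” keys. That shape, the file paths, and the helper names make_slug, to_fixture, and convert_file are all made up for illustration, so adjust them to fit your real data.

```python
import json
import re


def make_slug(name):
    """Lowercase the name and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def to_fixture(records, model_label="collection.thing", first_pk=1):
    """Wrap plain dicts in the model/pk/fields layout that loaddata reads."""
    fixture = []
    for offset, record in enumerate(records):
        fixture.append({
            "model": model_label,
            "pk": first_pk + offset,  # assumption: pks simply count upward
            "fields": {
                "name": record["name"],
                "description": record.get("description", ""),
                "slug": make_slug(record["name"]),
            },
        })
    return fixture


def convert_file(source_path, target_path):
    """Read the raw JSON list and write a fixture file next to it."""
    with open(source_path) as source:
        records = json.load(source)
    with open(target_path, "w") as target:
        json.dump(to_fixture(records), target, indent=2)
```

You would call convert_file with your own paths (for example, a raw file and collection/fixtures/load_initial_data.json) and then run loaddata on the result as above.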


#3

i’m super not a programmer so this is definitely over my head. i understand what the command python manage.py loaddata whatever.json does, but not at all how to make it work for me.

i need to add a reference to my model definition for every single piece of JSON data?

database ID? and it increments? that doesn’t make any sense - there’s only one database.

i’m following Hello Web App verbatim (other than thing = recipe and things = recipes) so i have no knowledge of an ID for anything database related.

here’s a very small sample of the recipe JSON file:

[
{
	"data_value":382.0,
	"name":"Glistering Melon",
	"text_type":"speckled_melon",
	"ingredients":[
		{
			"data_value":371.0,
			"name":"Gold Nugget",
			"text_type":"gold_nugget",
			"quantity":8
		},
		{
			"data_value":360.0,
			"name":"Melon",
			"text_type":"melon",
			"quantity":1
		}
	],
	"quantity":1,
	"notes":"Glistering Melon is a brewing ingredient. The ingredients must be placed exactly as shown - they make a relevant shape in the crafting table.",
	"category":"Brewing"
},
{
	"data_value":378.0,
	"name":"Magma Cream",
	"text_type":"magma_cream",
	"ingredients":[
		{
			"data_value":377.0,
			"name":"Blaze Powder",
			"text_type":"blaze_powder",
			"quantity":1
		},
		{
			"data_value":341.0,
			"name":"Slimeball",
			"text_type":"slime_ball",
			"quantity":1
		}
	],
	"quantity":1,
	"notes":"Magma Cream is a brewing ingredient. It can be crafted and it can be dropped by Magma Cubes (25% chance). This recipe does not require a shape - ingredients can be placed in any slot in the crafting table.",
	"category":"Brewing"
},
{
	... etc
}
]

#4

Yes (AFAIK — I’m going to ask some other people to confirm). Basically, every piece of your data needs a “home” in your model. You can go through https://docs.djangoproject.com/en/1.8/ref/models/fields/ and see what field type fits for the pieces you’re adding. ‘notes’ is just like “description” in the Hello Web App model example.

It gets a bit more complicated when you get to “ingredients” because that will probably want to be its own model, and I didn’t cover linking together multiple models in HWA1 (I’m working on a second book that’ll cover this though and I’ve written the chapter on this — email me and I can send you the draft and maybe that’ll help!)
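
To give a rough idea of what “its own model” means for the data, here is a sketch in plain Python (not code from HWA, and not Django models yet). Each item is stored once, and a separate list of links records which recipe uses which item and how many. Treating “data_value” as a unique item ID is my assumption and may not hold for your full data set; the names split_recipes, items, and links are made up for the example.

```python
def split_recipes(recipes):
    """Separate nested recipe data into items and recipe-to-ingredient links."""
    items = {}   # item id -> item details, each item stored only once
    links = []   # one entry per (recipe, ingredient) pair

    def remember(entry):
        # Assumption: data_value uniquely identifies an item.
        item_id = int(entry["data_value"])
        items.setdefault(item_id, {
            "name": entry["name"],
            "text_type": entry["text_type"],
        })
        return item_id

    for recipe in recipes:
        recipe_id = remember(recipe)  # the crafted result is an item too
        for ingredient in recipe.get("ingredients", []):
            links.append({
                "recipe": recipe_id,
                "ingredient": remember(ingredient),
                "quantity": ingredient["quantity"],
            })
    return items, links


sample = [{
    "data_value": 378.0,
    "name": "Magma Cream",
    "text_type": "magma_cream",
    "ingredients": [
        {"data_value": 377.0, "name": "Blaze Powder",
         "text_type": "blaze_powder", "quantity": 1},
        {"data_value": 341.0, "name": "Slimeball",
         "text_type": "slime_ball", "quantity": 1},
    ],
}]

items, links = split_recipes(sample)
print(sorted(items))  # the item ids found in the sample
print(links)          # which recipe uses which item, and how many
```

In Django terms, items and links would each become a model, with the links pointing at items, but that part is beyond what HWA1 covers.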

Actually “id” is the id of the “row” in the database. Every piece of information in your database gets an id (a “pk”, primary key). So in @TheGRS’s example, he meant that if you have an existing database, if you add something from a JSON file with an ID/PK of 1, it’ll overwrite the object in the existing database that has an ID/PK of 1. So you can lose data in your database if you do that. But if you’re working with a new project and just playing around with databases, feel free to accidentally overwrite stuff because you can delete the database and start over (find the database file, delete it and the folder for migrations, then create the database again using the steps in the models chapter in HWA.)
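
If it helps to see the overwrite happen, here is a tiny sketch using Python’s built-in sqlite3 module instead of Django. The table name and the INSERT OR REPLACE statement are my stand-ins for illustration; they are not what loaddata literally runs, but what happens to a row with a matching ID is the same idea.

```python
import sqlite3

# A throwaway in-memory database, so nothing real can be damaged.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT)")

# A row that already exists before any JSON is loaded.
conn.execute("INSERT INTO thing (id, name) VALUES (1, 'My hand-entered thing')")

# Incoming rows, as if read from a JSON file with pk 1 and pk 2.
incoming = [(1, "Thing 1"), (2, "Thing 2")]
conn.executemany(
    "INSERT OR REPLACE INTO thing (id, name) VALUES (?, ?)", incoming
)

rows = conn.execute("SELECT id, name FROM thing ORDER BY id").fetchall()
print(rows)  # → [(1, 'Thing 1'), (2, 'Thing 2')]
```

The hand-entered row with ID 1 is gone, replaced by the incoming row with the same ID, while ID 2 is simply added.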

So, all that said, you might be jumping a tiny bit ahead with your JSON file, esp. since it’s a tiny bit more complicated than the HWA example. What I would do is copy that file, remove any complicated data, and try the steps to load a database with the simplified data. See if you can get it to work with a simpler file. Once you’ve done that and feel comfortable, you can try making the model more complicated, delete your local database, and try loading the more-complicated JSON data. Basically, start small so you can learn the process first.

I hope this helps! :D


#5

Everything Tracy said is very true and it’s definitely the best course of action for a beginner.

Once you’re comfortable putting things into the database, there is another thing you can look into: You could store some JSON as a field in a model. Or any other kind of encoded format, like Markdown.

Consider, for example, a blog post model. In the most common implementation, you’d have a blog post title, the author, the date, and the body of the post. The body is typically saved with some kind of formatting like Markdown (similar to how this forum’s formatting works). When we fetch the blog post, we use a library to parse the Markdown into HTML and render it. We could store each header and paragraph as separate rows in the database rather than one String field with Markdown, but that wouldn’t buy us anything (and would actually make things more complicated). We only ever need the full article, so we may as well store it as one thing.

Now back to the JSON. If you only ever need the full JSON structure (such as if you’re using some JavaScript to do something with it later), then you could just save the whole thing as one column. The problem with this is that you won’t be able to easily query your database based on any of the fields in the JSON structure, for example to find all recipes that use Slimeball. But if this isn’t something you’ll ever need to do, then that’s not a problem. :)
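
Here’s a small sketch of that trade-off using Python’s built-in sqlite3 and json modules rather than Django. The table, column, and function names are made up for the example. Because the database only sees one opaque string per recipe, the Slimeball search has to pull every row out and decode it in Python.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recipe (id INTEGER PRIMARY KEY, name TEXT, data TEXT)"
)

recipes = [
    {"name": "Magma Cream",
     "ingredients": [{"name": "Blaze Powder"}, {"name": "Slimeball"}]},
    {"name": "Glistering Melon",
     "ingredients": [{"name": "Gold Nugget"}, {"name": "Melon"}]},
]
for recipe in recipes:
    # The whole structure is stored as a single JSON string.
    conn.execute(
        "INSERT INTO recipe (name, data) VALUES (?, ?)",
        (recipe["name"], json.dumps(recipe)),
    )


def recipes_using(conn, ingredient_name):
    """Decode every stored recipe and check its ingredients in Python."""
    matches = []
    query = "SELECT name, data FROM recipe ORDER BY id"
    for name, data in conn.execute(query):
        ingredients = json.loads(data)["ingredients"]
        if any(item["name"] == ingredient_name for item in ingredients):
            matches.append(name)
    return matches


print(recipes_using(conn, "Slimeball"))  # → ['Magma Cream']
```

With two recipes that’s fine, but it means every search reads the entire table, which is the cost of keeping the JSON as one column.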

If you’re going to look into using a JSON field in your model, consider using a library like https://github.com/bradjasper/django-jsonfield which automatically encodes/decodes the JSON for you so you don’t have to worry about it.

Again, this is for later—once you’re looking for something more advanced. To get started though, I’d do what Tracy said. :)


#6

@limedaring:

Thank you for clarifying this process for me! I understand much better what’s going on when loading JSON data this way. I will hold off on the ingredients with my first attempts and get familiar with this process using a simplified JSON file as you suggest. I think I can do this :)

I love that you’re writing a second book and you can definitely count me as a supporter. I will email you if I need a peek at your model linking chapter. Slow and steady will probably be the best way, though ;)

@shazow:

Thanks also for explaining how JSON can be used directly. I probably won’t do this, but it helps me understand the relationship better.


#7

Woo! Keep us posted on your progress. :)


#8

So this is exciting and FYI in case anyone else wonders about doing this (might be mentioned later in the book but I haven’t gotten there yet.)

If you want to build a second set of Things, it’s really easy to just jump back and do it all another time. Start at Chapter 6, Adding Dynamic Data > Setting up your model (page 36 paperback edition) and finish at Chapter 10, Adding a Registration Page > Changing our model so Users can own a Thing (page 77 paperback edition)

Go through everything again including makemigrations / migrate and all the .py file and template file edits just making sure to use your new thing name. It’s easier the second time around because you’ve done it before, but also you can copy the thing_detail and edit_thing templates as well as the edit_thing view definition.

I haven’t yet done “Changing our model so Users can own a Thing” so I’m not sure if I’m about to break everything by having done this. Fingers crossed!


#9

As long as you’re working locally, don’t worry about breaking things! :D Yay for trying out new things and thanks for giving the details!