|
This saves about 1GiB.
|
|
https://vndb.org/t950.317
|
|
|
|
|
|
Previously the website was connected to the database with a "database
owner" user, which has far too many permissions. Now there's a special
vndb_site user with only the necessary permissions. The primary
reason to do this is to decrease the impact if the site process is
compromised. E.g. it's now no longer possible to delete or modify old
entry revisions. An attacker can still do a lot of damage, however.
Additionally (and this was the main reason to implement this change in
the first place), the user sessions, passwords and email data are no
longer easily accessible. Hopefully, the new user management
abstractions will prevent email and password dumps in case of an SQL
injection or RCE vulnerability in the site code. Of course, this only
works if my implementation is fully correct and there's no privilege
escalation vulnerability somewhere.
Furthermore, changing your password now invalidates any existing
sessions, and the password reset function is disabled for 'usermods'
(because usermods can list email addresses from the database, and the
password reset function could still allow an attacker to gain access to
anyone's account).
I also changed the format of the password reset tokens, as they totally
don't need to be salted.
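The token remark can be illustrated with a small sketch (this is not
VNDB's actual Perl code; the function names are made up for
illustration): unlike low-entropy passwords, a randomly generated reset
token doesn't benefit from a salt, because no precomputed table can
cover a 2^160 input space.

```python
# Illustrative sketch, not VNDB's implementation: why random reset
# tokens can be stored as a plain, unsalted hash.
import hashlib
import secrets

def new_reset_token():
    """Generate a token to mail to the user and the hash to store."""
    token = secrets.token_hex(20)  # 160 bits of randomness
    # An unsalted hash is fine here: an attacker who dumps the table
    # cannot precompute hashes over a 2^160 space of possible tokens.
    stored = hashlib.sha256(token.encode()).hexdigest()
    return token, stored

def check_reset_token(token, stored):
    """Compare in constant time to avoid leaking the hash via timing."""
    digest = hashlib.sha256(token.encode()).hexdigest()
    return secrets.compare_digest(digest, stored)
```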
|
|
|
|
VNDB tends to get unresponsive for a few minutes when the daily cron is
run. This should help somewhat.
|
|
|
|
This is a generalization of the search improvements made in
7da2edeaa0f6cf7794f4f8f68960497dc1be893c and
92235222dba4e5d0c7713d53ef12e0f10e371b83, and has been applied to the
dropdown searches for producers, staff, tags and traits.
For all those searches, exact matches are listed first, followed by
prefix matches, and then substring matches. Relevance is currently only
based on the primary name/title and ignores aliases (except for staff).
This is fixable, but not trivial, and I'm not sure it's all that useful.
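The ordering described above can be sketched as follows (a minimal
illustration with assumed function names, not the actual SQL-side
implementation): classify each candidate as an exact, prefix or
substring match, and sort on that class first.

```python
# Sketch of the relevance ordering: exact matches first, then prefix
# matches, then substring matches; non-matches are dropped.
def match_class(query, name):
    q, n = query.lower(), name.lower()
    if n == q:
        return 0   # exact match
    if n.startswith(q):
        return 1   # prefix match
    if q in n:
        return 2   # substring match
    return None    # no match at all

def rank_results(query, names):
    scored = [(match_class(query, n), n) for n in names]
    # Sort by match class, then alphabetically as a tie-breaker.
    return [n for c, n in sorted(s for s in scored if s[0] is not None)]
```

In the real thing, the same three-way classification would be computed
in the search query itself so the database can sort and limit results.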
|
|
|
|
|
|
|
|
|
|
...unless I missed something.
|
|
This has been mostly automated.
|
|
I definitely needed the Tie::IxHash thing for these.
|
|
This removes the reliance on sort() to provide meaningful ordering (the
keys aren't always good for ordering) and removes the 'order' hack used
for (vn|prod)_relations.
|
|
|
|
|
|
TODO: Intern strings again to simplify the code.
The immediate effect of this commit is that starting the util/vndb.pl
script and generating the JS file is much faster now and that vndb.pl
uses less memory. Translations have already been disabled on the main
VNDB for a week now.
|
|
|
|
Compresses a little better. I reduced the number of iterations required
to find the optimal image size in spritegen.pl, but generating the
icons.png is *incredibly slow* when combining zopflipng with the 'slow'
option. It's possible to parallelize the calculation and use multiple
cores to speed it up, but that seems overkill.
Some icons.png compression stats:
METHOD          SIZE   RUNTIME
default         18103  <1 sec
slow            17941  few secs
pngcrush        15385  <1 sec
pngcrush+slow   15148  few mins
zopflipng       14986  few secs
zopflipng+slow  14898  ~1 hour
|
|
- Merged polls table into threads table. Not much of a
storage/performance difference, and it's a bit simpler this way.
- Merged DB::Polls into DB::Discussions. Mainly because of the above
change in DB structure.
- Add option to remove an existing poll.
- Allow preview and recast to be changed without deleting the votes
- Set preview option by default. Because personal preferences. :)
- Minor form validation differences
|
|
|
|
I'd have preferred to stick with XHTML 1.0, but unfortunately browsers
won't allow you to use modern Javascript APIs with an older doctype.
Note that most pages don't actually validate correctly as HTML5, I'm
relying on browsers to be lenient.
In either case, I'd like VNDB to stay valid XML (XHTML5, then), and
luckily that shouldn't be a problem.
|
|
They had to be deleted from the database at some point, otherwise we'd
still have thousands of easily-cracked password hashes in the database.
Note that I could have opted to apply scrypt on top of the sha256 hashes
so the passwords would remain secure without needing to reset
everything, but doing that a year after switching to scrypt is likely
not worth it. Everyone who still actively uses their account has already
been converted to scrypt; everyone else should just reset their password
whenever they decide to come back.
|
|
|
|
The new database schema doesn't allow an alias to be removed when it is
still linked to a VN.
|
|
These indices provide a significant speed-up of /v+ and /u+ pages, and
improve some other stuff as well.
|
|
An index on threads_posts.date was necessary to speed up some very
common "recent posts" queries on both the homepage and the thread index.
Postgres thought that the same index could be used to speed up the
full-text search (because it's ordered by date, after all), but that
completely killed performance. That was solved with a bb_tsvector()
wrapper to tell the query planner that not using the full-text index is
incredibly slow, which in turn improved the search performance beyond
what it was before.
Many thread-related queries are still somewhat slow, but that seems to
be a limitation in the schema. I'll just keep monitoring to see if
that's worth fixing in the future.
Interestingly, dbThreadCount() needs to use a sequential scan, but it's
still remarkably fast.
|
|
This changes quite a bit to the way the editing functions work. Because
these functions are very repetitive and it's easy to keep things out of
sync, I created a script to generate them automatically. I had to rename
a few function and table names for consistency to make this work.
Since database entries don't have a 'latest' column anymore, and since
the order in which tables are updated doesn't have to be fixed, I
dropped many of the SQL triggers and replaced them with an
edit_committed() function which is called from edit_*_commit() and
checks for stuff to be done.
Don't forget to run 'make' before importing the update script.
|
|
|
|
This basically makes VNDB browsable again, but editing entries is still
broken.
I split off the get-old-revision functionality from the db*Get() methods
into db*GetRev(). This split makes sense even with the old SQL schema:
db*Get() had to special-case some joins/filters when fetching an older
revision, and none of the other filters would work in that case. This
split does cause some code duplication in that all db*GetRev() methods
look very much alike, and the columns they fetch are almost identical
to those of the db*Get() methods. Not sure yet how to avoid the
duplication elegantly.
I didn't do a whole lot of query optimization yet (most issues require
extra indices, I'll investigate later which indices will make a big
difference), but I did fix some low hanging fruit whenever I encountered
something.
I don't think I've worsened anything, performance-wise.
|
|
This commit breaks pretty much everything. Lots of code will have to be
fixed to work with this new schema.
The basic idea is to separate live data from archived data, which allows
for smaller and more effective indices on the live data; the archived
data doesn't need such indices and doesn't have to be accessed at all
for most operations. Another goal is to eliminate table joins when
fetching some necessary information, e.g. it's no longer necessary to
join the main item tables in order to fetch only the latest revision of
some item data.
This is very much work in progress. I might stumble upon some weird
issue while fixing the code, and might have to redesign everything
from scratch again. Let's just see how things go.
|
|
Turns out that fetching whether or not you have unread notifications
(done on every pageview if you're logged in) was pretty slow. The index
speeds up both that query and the "my notifications" view.
The extra purge for old notifications for users with more than 500
notifications ensures that the index stays effective for the unread
notifications count. Otherwise it'll have to read half of the
notifications table anyway to check the 'unread' filter.
|
|
Turning the foreign key references into idempotent statements required
adding the name for each reference in the query. I used the names of the
production database, but since the names are autogenerated at creation
time, they may differ if the database was created slightly differently.
Using explicit names for everything and having idempotent SQL statements
is rather useful when making nontrivial modifications to the database
schema. Which is something I consider doing.
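A rough sketch of the idea, with an assumed naming convention and a
hypothetical helper (the actual schema uses whatever names Postgres
generated): give every foreign key an explicit name, so the statement
can be made idempotent by dropping any existing constraint of that name
first.

```python
# Hypothetical generator for idempotent foreign-key DDL. The
# "<table>_<column>_fkey" naming convention mimics Postgres' default
# autogenerated names; the real ones may differ.
def fkey_ddl(table, column, ref_table, ref_column="id"):
    name = f"{table}_{column}_fkey"
    return (
        f"ALTER TABLE {table} DROP CONSTRAINT IF EXISTS {name};\n"
        f"ALTER TABLE {table} ADD CONSTRAINT {name} "
        f"FOREIGN KEY ({column}) REFERENCES {ref_table} ({ref_column});"
    )
```

Running the generated pair of statements twice leaves the schema in the
same state, which is exactly what makes update scripts re-runnable.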
|
|
Same reasoning as 19ce5fcf536ed478ad34b6b1014bf6f44841d25d
|
|
Adds slightly more strict validation and simplifies further processing.
|
|
No more need for extra json_encode/json_decode calls, and the
form_compare() function is more lenient w.r.t. integer/string
comparison.
This is the improvement I described in commit
ed86cfd12b0bed7352e2be525b8e63cb4d6d5448
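The lenient comparison can be sketched like this (a rough Python
illustration, not the actual Perl form_compare()): recurse through the
structure and compare leaf values as strings, since form data read back
from HTML is always stringly typed.

```python
# Sketch of a deep comparison that is lenient about integer/string
# differences: {"id": 10} and {"id": "10"} should compare equal.
def form_compare(a, b):
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            form_compare(a[k], b[k]) for k in a)
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(map(form_compare, a, b))
    # Leaf values: compare as strings so numeric coercion is harmless.
    return str(a) == str(b)
```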
|
|
As suggested by https://vndb.org/t2520.168
|
|
Looks like 0 is actually used often to indicate some special value.
Affects basically all 'check all' boxes (had to modify some of those
boxes because some used -1, but that wasn't a problem).
|
|
|
|
This is less convenient than I had expected, because all the form
handling code is designed to work with plain strings rather than any
scalar. This means the JSON data has to be encoded again to get into
$frm (not doing this means that, if the form didn't validate, the field
won't be filled out correctly), then decoded for validation, and then
encoded again for comparison.
I suspect the better solution is to fix the form handling code to handle
arbitrary data structures: comparison can be done by deep comparison
rather than a simple string compare, and the form generator can
auto-encode-to-json if it sees a complex object.
Another advantage of this solution is that the comparison function can
be less strict with respect to number formatting. In the current scheme
you have to be very careful that numbers are not automatically coerced
into string format, otherwise the comparison will fail.
Either way, that's an idea for the future...
|
|
|
|
And added new 'page' and 'id' templates for more strict validation.
|
|
And also fix strip_bb_tags() to be case-insensitive and fix a bug in
converting the query into a tsquery.
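The case-insensitivity fix could look roughly like this (a hypothetical
re-implementation; the real function is Perl and the exact tag list is
an assumption): match BBCode tags with a case-insensitive pattern so
[URL] is stripped just like [url].

```python
# Hypothetical sketch of a case-insensitive strip_bb_tags(). The set of
# recognized tags here is assumed, not taken from the actual code.
import re

_BB_TAG = re.compile(
    r"\[/?(?:b|i|u|s|spoiler|quote|raw|code|url)(?:=[^\]]+)?\]",
    re.IGNORECASE)

def strip_bb_tags(text):
    return _BB_TAG.sub("", text)
```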
|
|
Inspired by wakaranai's implementation at
https://github.com/morkt/vndb/commit/b852c87ad145fdaaa09c79b6378dd819b46f7e87
This version is different in a number of aspects:
- Separate search functions for title search and fulltext post search.
Perhaps not the most convenient option, but the downside of a combined
search is that if the query matches a thread's title, then all of
the posts in that thread will show up in the results. This didn't seem
very useful.
- Sorting is based purely on post date. Rank-based sort is slow without
a separate caching column, and in my opinion not all that useful.
Implementation differences:
- Integrated in the existing DB::Discussions functions, so less code to
maintain and more code reuse.
- No separate caching column for the tsvector, a functional index is
used instead. This is a bit slower (index results need to be
re-checked against the actual messages, hence the slowdown), but has
the advantage of smaller database dumps and less complexity in
updating the cache.
Things to fix or look at:
- Highlighting of the search query in message contents.
- Allow or-style query matching
|
|
The char(2) solution is both inefficient and ugly. I also needed to be
careful with the extra space that Postgres automatically pads onto
single-character values.
|
|
A recent version of imagemagick creates 16 bit depth PNG images by
default for some reason. This results in an unnecessarily large file
size increase and pngcrush doesn't do much to counter it (and its
-bit_depth option has been deprecated, too).
The atomic replace is quite handy to avoid people seeing any weird
intermediate images while the slow+pngcrush options are being used.
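The atomic-replace pattern mentioned above can be sketched as follows
(an illustrative helper, not the actual script): write the
slow-to-generate image to a temporary file, then rename it over the old
one, so visitors never see a half-written file.

```python
# Sketch of atomic file replacement: readers see either the old file
# or the new one, never a partially written intermediate.
import os

def atomic_write(path, data):
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes hit the disk
    os.replace(tmp, path)      # atomic rename on POSIX filesystems
```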
|
|
Tends to compress a bit better than JavaScript::Minifier::XS, but it is
also a lot slower, so not really useful when devving.
Stats for en.js:
                     raw    gzip
uglifyjs           68199   19446
JS::Minifier::XS   79862   21624
Uncompressed      107662   28663
On an unrelated note, I like how jQuery boasts about being "Only 32kB
minified and gzipped.". That's quite a bit more than all of VNDB's
Javascript combined. For a damn library.
|