path: root/util
Age | Commit message | Author | Files | Lines
2016-11-27SQL: Use separate role for Multi2.26Yorhel2-2/+69
2016-11-27SQL: Use separate role for the website + disallow access to user dataYorhel5-4/+206
Previously the website was connected to the database with a "database owner" user, which has far too many permissions. Now there's a special vndb_site user with only the necessary permissions. The primary reason to do this is to decrease the impact if the site process is compromised. E.g. it's now no longer possible to delete or modify old entry revisions. An attacker can still do a lot of damage, however. Additionally (and this was the main reason to implement this change in the first place), the user sessions, passwords and email data are no longer easily accessible. Hopefully, the new user management abstractions will prevent email and password dumps in case of an SQL injection or RCE vulnerability in the site code. Of course, this only works if my implementation is fully correct and there's no privilege escalation vulnerability somewhere. Furthermore, changing your password now invalidates any existing sessions, and the password reset function is disabled for 'usermods' (because usermods can list email addresses from the database, and the password reset function could still allow an attacker to gain access to anyone's account). I also changed the format of the password reset tokens, as they totally don't need to be salted.
2016-08-09Add Croatian languageYorhel2-1/+5
2016-07-31SQL: Improve trait cache update from 206 to 16 secondsYorhel3-4/+12
VNDB tends to get unresponsive for a few minutes when the daily cron is run. This should help somewhat.
2016-07-31Add Thai languageYorhel2-1/+5
2016-07-03Generalize substring search relevance + apply to most dropdown searchesYorhel2-0/+26
This is a generalization of the search improvements made in 7da2edeaa0f6cf7794f4f8f68960497dc1be893c and 92235222dba4e5d0c7713d53ef12e0f10e371b83, and has been applied to the dropdown searches for producers, staff, tags and traits. For all those searches, exact matches are listed first, followed by prefix matches, and then substring matches. Relevance is currently only based on the primary name/title and ignores aliases (except for staff). This is fixable, but not trivial, and I'm not sure it's all that useful.
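The ordering described above (exact, then prefix, then substring) can be sketched as follows. This is only an illustration in Python; the actual change lives in the SQL/Perl code, and the names relevance() and search() are hypothetical.

```python
from typing import List, Optional


def relevance(name: str, query: str) -> Optional[int]:
    """Return 0 for an exact match, 1 for a prefix match,
    2 for a substring match and None for no match (case-insensitive)."""
    n, q = name.lower(), query.lower()
    if n == q:
        return 0
    if n.startswith(q):
        return 1
    if q in n:
        return 2
    return None


def search(names: List[str], query: str) -> List[str]:
    """Filter names by query and order them by relevance, then alphabetically."""
    hits = []
    for name in names:
        rank = relevance(name, query)
        if rank is not None:
            hits.append((rank, name.lower(), name))
    return [name for _, _, name in sorted(hits)]
```

Only the primary name is ranked here, mirroring the note that aliases are ignored for most of these searches.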
2016-07-02Util::ValidateTemplates: Fix forgotten import of kv_validateYorhel1-1/+1
2016-07-02Validate release dates + move validation out of vndb.plYorhel1-75/+0
2016-02-15Add Tagalog languageYorhel2-1/+5
2016-02-12JS: Fix char_roles bug + CSS: Minor tweaks to main VN info layoutYorhel1-1/+1
2016-01-23L10N: Remove all remaining traces of the interface translation featureYorhel2-220/+0
...unless I missed something.
2016-01-17L10N: Intern all Javascript strings and rename main JS fileYorhel1-100/+4
This has been mostly automated.
2016-01-17L10N: Intern blood_types/genders/(char|staff)_roles/discussion_boardsYorhel1-4/+4
I definitely needed the Tie::IxHash thing for these.
2016-01-17Use Tie::IxHash for some listsYorhel1-3/+3
This removes the reliance on sort() to provide meaningful ordering (the keys aren't always good for ordering) and removes the 'order' hack used for (vn|prod)_relations.
2016-01-17L10N: Intern tag_cats/voiced/animated/*_statusYorhel1-3/+3
2016-01-16L10N: Intern languages/platforms/resolutions/media/ptype/rtype/vnlengthYorhel1-11/+8
2016-01-16L10N: Remove all translationsYorhel1-30/+2
TODO: Intern strings again to simplify the code. The immediate effect of this commit is that starting the util/ script and generating the JS file is much faster now and uses less memory. Translations have already been disabled on the main VNDB for a week now.
2016-01-10Use atomic replace when writing .gz assetsYorhel2-2/+8
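A minimal sketch of the atomic-replace idea, assuming the usual approach of writing to a temporary file in the same directory and renaming it over the target. The function name write_gz_atomic() is hypothetical and the real generators are Perl scripts; this Python version only illustrates the technique.

```python
import gzip
import os
import tempfile


def write_gz_atomic(path: str, data: bytes) -> None:
    """Write gzip-compressed data to path without ever exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as raw, gzip.GzipFile(fileobj=raw, mode="wb") as gz:
            gz.write(data)
        # Readers see either the old file or the complete new one.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A web server serving the .gz asset while it is being regenerated then never sees a truncated file.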
2016-01-10Support zopfli/zopflipng for all static asset generatorsYorhel3-22/+25
Compresses a little better. I reduced the number of iterations required to find the optimal image size in, but generating the icons.png is *incredibly slow* when combining zopflipng with the 'slow' option. It's possible to parallelize the calculation and use multiple cores to speed it up, but that seems overkill. Some icons.png compression stats:
  METHOD          SIZE   RUNTIME
  default         18103  <1sec
  slow            17941  few secs
  pngcrush        15385  <1sec
  pngcrush+slow   15148  few mins
  zopflipng       14986  few secs
  zopflipng+slow  14898  ~1 hour
2015-11-11Misc poll improvementsYorhel3-25/+46
- Merged polls table into threads table. Not much of a storage/performance difference, and it's a bit simpler this way.
- Merged DB::Polls into DB::Discussions. Mainly because of the above change in DB structure.
- Add option to remove an existing poll.
- Allow preview and recast to be changed without deleting the votes
- Set preview option by default. Because personal preferences. :)
- Minor form validation differences
2015-11-10Merge branch 'master' into pollmorkt8-817/+1096
2015-11-01Switch to HTML5 doctype + s/acronym/abbr/ + s/&nbsp;/&#xa0;/eYorhel1-0/+4
I'd have preferred to stick with XHTML 1.0, but unfortunately browsers won't allow you to use modern Javascript APIs with an older doctype. Note that most pages don't actually validate correctly as HTML5, I'm relying on browsers to be lenient. In either case, I'd like VNDB to stay valid XML (XHTML5, then), and luckily that shouldn't be a problem.
2015-11-01Removed support for sha256-hashed passwordsYorhel2-3/+3
They had to be deleted from the database at some point, otherwise we'd still have thousands of easily-cracked password hashes in the database. Note that I could have opted to use scrypt on top of the sha256 hashes so the passwords would remain secure without needing to reset everything, but doing that after one year of switching to scrypt is likely not worth it. Everyone who still actively uses their account has already been converted to scrypt, everyone else should just reset their password whenever they decide to come back.
2015-11-01Remove deprecated 'staffedit' permission flagYorhel1-0/+2
2015-10-25Staff: Add error msg when removing used alias + fix bug in alias editingYorhel1-2/+2
The new database schema doesn't allow an alias to be removed when it is still linked to a VN.
2015-10-24SQL: Throwing around some indicesYorhel1-2/+8
These indices provide a significant speed-up of /v+ and /u+ pages, and improve some other stuff as well.
2015-10-24Improve several discussion board SQL queriesYorhel3-2/+13
An index on was necessary to speed up some very common "recent posts" queries on both the homepage and the thread index. Postgres thought that the same index could be used to speed up the full-text search (because it's ordered by date, after all), but that completely killed performance. That was solved with a bb_tsvector() wrapper to tell the query planner that not using the full-text index is incredibly slow, which in turn improved the search performance beyond what it was. Many thread-related queries are still somewhat slow, but that seems to be a limitation in the schema. I'll just keep monitoring to see if that's worth fixing in the future. Interestingly, dbThreadCount() needs to use a sequential scan, but it's still remarkably fast.
2015-10-21SQL: Fix editing + func.sql + triggers.sql + autocreate editing funcsYorhel7-585/+406
This changes quite a bit about the way the editing functions work. Because these functions are very repetitive and it's easy to let things get out of sync, I created a script to generate them automatically. I had to rename a few function and table names for consistency to make this work. Since database entries don't have a 'latest' column anymore, and since the order in which tables are updated doesn't have to be fixed, I dropped many of the SQL triggers and replaced them with an edit_committed() function which is called from edit_*_commit() and checks for stuff to be done. Don't forget to run 'make' before importing the update script.
2015-10-18discussion board polls.morkt1-0/+24
2015-10-17SQL: Fix all browsing queries to use the new schemaYorhel1-0/+1
This basically makes VNDB browsable again, but editing entries is still broken. I split off the get-old-revision functionality from the db*Get() methods into db*GetRev(). This split makes sense even with the old SQL schema: db*Get() had to special-case some joins/filters when fetching an older revision, and none of the other filters would work in that case. This split does cause some code duplication in that all db*GetRev() methods look very much alike, and that the columns they fetch are almost identical to those of the db*Get() methods. Not sure yet how to avoid the duplication elegantly. I didn't do a whole lot of query optimization yet (most issues require extra indices, I'll investigate later which indices will make a big difference), but I did fix some low hanging fruit whenever I encountered something. I don't think I've worsened anything, performance-wise.
2015-10-17SQL: Convert all item-tables to a different schemaYorhel3-235/+669
This commit breaks pretty much everything. Lots of code will have to be fixed to work with this new schema. The basic idea is to separate live data from archived data, which allows for smaller and more effective indices on the live data, while the archived data doesn't need such indices and doesn't have to be accessed at all for most operations. Another goal is to eliminate table joins to fetch some necessary information, e.g. it's not necessary anymore to join the main item tables in order to fetch only the latest revision of some item data. This is very much work in progress. I might stumble upon some weird issue while fixing the code, and might have to redesign everything from scratch again. Let's just see how things go.
2015-10-12Notifications: Allow max 500 notifies per user + add SQL index on uidYorhel2-0/+5
Turns out that fetching whether or not you have unread notifications (done on every pageview if you're logged in) was pretty slow. The index speeds up both that query and the "my notifications" view. The extra purge for old notifications for users with more than 500 notifications ensures that the index stays effective for the unread notifications count. Otherwise it'll have to read half of the notifications table anyway to check the 'unread' filter.
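The purge described above could look roughly like the sketch below. It uses SQLite as a stand-in for Postgres, and the table layout (notifications with id, uid and unread columns), the index name and the assumption that a larger id means a newer notification are all hypothetical, chosen only to make the example self-contained.

```python
import sqlite3

MAX_PER_USER = 500  # limit named in the commit subject


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed notifications layout."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE notifications ("
        " id INTEGER PRIMARY KEY,"
        " uid INTEGER NOT NULL,"
        " unread INTEGER NOT NULL DEFAULT 1)"
    )
    # Index on uid, so per-user lookups don't scan the whole table.
    db.execute("CREATE INDEX notifications_uid ON notifications (uid)")
    return db


def purge_old_notifications(db: sqlite3.Connection, limit: int = MAX_PER_USER) -> int:
    """Delete all but the newest `limit` notifications of every user.

    Returns the number of deleted rows.
    """
    cur = db.execute(
        """
        DELETE FROM notifications
         WHERE id IN (
               SELECT n.id
                 FROM notifications n
                WHERE (SELECT COUNT(*)
                         FROM notifications m
                        WHERE m.uid = n.uid AND m.id > n.id) >= ?
         )
        """,
        (limit,),
    )
    return cur.rowcount
```

Capping the row count per user bounds how many rows the index can return for one uid, which is what keeps the unread check cheap.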
2015-10-12SQL: Split constraints/indices/triggers in new file + use idempotent SQLYorhel5-179/+172
Turning the foreign key references into idempotent statements required adding the name for each reference in the query. I used the names of the production database, but since the names are autogenerated at creation time, it's possible that they have other names if the database has been created slightly differently. Using explicit names for everything and having idempotent SQL statements is rather useful when making nontrivial modifications to the database schema. Which is something I consider doing.
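To illustrate why explicit names help with idempotency, here is a small sketch using SQLite as a stand-in for Postgres. The table and index names are made up, and the commit itself deals with named foreign key constraints rather than indices; the point is only that an object with a known name can be dropped and recreated safely, so the script can be run any number of times.

```python
import sqlite3

# Every statement either checks for existence or drops by explicit name first,
# so running the whole script twice leaves the schema unchanged.
SCHEMA_UPDATE = """
CREATE TABLE IF NOT EXISTS threads (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
DROP INDEX IF EXISTS threads_title_idx;
CREATE INDEX threads_title_idx ON threads (title);
"""


def apply_update(db: sqlite3.Connection) -> None:
    """Apply the (idempotent) schema update script."""
    db.executescript(SCHEMA_UPDATE)


def index_names(db: sqlite3.Connection) -> list:
    """List the names of all indices currently defined."""
    rows = db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    )
    return [row[0] for row in rows]
```

With an autogenerated name the DROP step would have nothing reliable to refer to, which is the problem the commit message describes.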
2015-10-11SQL: Convert producers_rev.type into enumYorhel3-1/+9
Same reasoning as 19ce5fcf536ed478ad34b6b1014bf6f44841d25d
2015-10-03formValidate: Add json_(maxitems|unique|sort) options to json templateYorhel1-2/+46
Adds slightly more strict validation and simplifies further processing.
2015-10-03Handle JSON data natively when processing form dataYorhel1-3/+2
No more need for extra json_encode/json_decode calls, and the form_compare() function is more lenient w.r.t. integer/string comparison. This is the improvement I described in commit ed86cfd12b0bed7352e2be525b8e63cb4d6d5448
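A rough sketch of what a lenient deep comparison might look like, in Python rather than Perl. The name form_equal() is hypothetical and this is not the actual form_compare() code; it only shows the idea that nested structures are compared recursively while scalars are compared by their string form, so 1 and "1" count as equal.

```python
def form_equal(a, b) -> bool:
    """Deep-compare two decoded JSON values, treating 1 and "1" as equal."""
    containers = (dict, list, tuple)
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(form_equal(a[k], b[k]) for k in a)
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(form_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, containers) or isinstance(b, containers):
        return False  # container vs. scalar, or dict vs. list
    if a is None or b is None:
        return a is b
    # Scalars: compare by string form so integer/string differences don't matter.
    return str(a) == str(b)
```

List order still matters in this sketch, which is why sorting during validation is useful before comparing.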
2015-10-01VN search: Add some more quote characters + & to normalizationYorhel1-0/+4
As suggested by
2015-09-20formValidate(): Let's just allow a '0' id - fix more errorsYorhel1-1/+1
Looks like 0 is actually used often to indicate some special value. Affects basically all 'check all' boxes (had to modify some of those boxes because some used -1, but that wasn't a problem).
2015-09-20Fix handling of empty seiyuu/credits fieldsYorhel1-1/+1
2015-09-20formValidate: Add json template and remove json_validate() functionYorhel1-1/+24
This is less convenient than I had expected, because all the form handling code is designed to work with plain strings rather than any scalar. This means the json data has to be encoded again to get into $frm (not doing this means that, if the form didn't validate, the field won't be filled out correctly). And then decoded for validation, and then encoded again for comparison. I suspect the better solution is to fix the form handling code to handle arbitrary data structures: comparison can be done by deep comparison rather than a simple string compare, and the form generator can auto-encode-to-json if it sees a complex object. Another advantage of this solution is that the comparison function can be less strict with respect to number formatting. In the current scheme you have to be very careful that numbers are not automatically coerced into string format, otherwise the comparison will fail. Either way, that's an idea for the future...
2015-09-20formValidate: Created templates for gtin and editsum fieldsYorhel1-0/+3
2015-09-20Update usage kv_validate() to upcoming TUWF 1.0Yorhel1-6/+3
And added new 'page' and 'id' templates for more strict validation.
2015-09-07Handler::Discussions: Use ts_headline() to format search resultsYorhel2-2/+14
And also fix strip_bb_tags() to be case-insensitive and fix a bug in converting the query into a tsquery.
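As an illustration of case-insensitive tag stripping, here is a small Python sketch. The set of tag names in the pattern is an assumption made for the example, and this is not the actual strip_bb_tags() implementation, which lives in the SQL/Perl code.

```python
import re

# Assumed tag set; matches opening and closing tags, optionally with "=value".
_BB_TAG = re.compile(
    r"\[/?(?:b|i|u|s|url|spoiler|quote|raw|code)(?:=[^\]]*)?\]",
    re.IGNORECASE,
)


def strip_bb_tags(text: str) -> str:
    """Remove BBCode-style tags regardless of letter case, keeping their contents."""
    return _BB_TAG.sub("", text)
```

Without the IGNORECASE flag, tags such as [URL=...] or [/SPOILER] would be left in the text and end up in the search index.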
2015-09-07Implement discussion board search functionYorhel3-0/+18
Inspired by wakaranai's implementation at
This version is different in a number of aspects:
- Separate search functions for title search and fulltext post search. Perhaps not the most convenient option, but the downside of a combined search is that if the query matches the thread's title, then all of the posts in that thread will show up in the results. This didn't seem very useful.
- Sorting is based purely on post date. Rank-based sort is slow without a separate caching column, and in my opinion not all that useful.
Implementation differences:
- Integrated in the existing DB::Discussions functions, so less code to maintain and more code reuse.
- No separate caching column for the tsvector, a functional index is used instead. This is a bit slower (index results need to be re-checked against the actual messages, hence the slowdown), but has the advantage of smaller database dumps and less complexity in updating the cache.
Things to fix or look at:
- Highlighting of the search query in message contents.
- Allow or-style query matching
2015-09-06SQL: Convert threads_board.type to ENUMYorhel3-1/+7
The char(2) solution is both inefficient and ugly. Also needed to be careful with handling the extra space that Postgres would automatically add to single-character types.
Add pngcrush/slow options + force png32 + atomic replaceYorhel1-6/+13
A recent version of imagemagick creates 16 bit depth PNG images by default for some reason. This results in an unnecessarily large file size increase and pngcrush doesn't do much to counter it (and its -bit_depth option has been deprecated, too). The atomic replace is quite handy to avoid people seeing any weird intermediate images while the slow+pngcrush options are being used.
2015-08-17jsgen: Support external command for JS compression (like uglifyjs)Yorhel1-7/+30
Tends to compress a bit better than JavaScript::Minifier::XS. But is also a lot slower, so not really useful when devving. Stats for en.js:
                    raw     gzip
  uglifyjs          68199   19446
  JS::Minifier::XS  79862   21624
  Uncompressed      107662  28663
On an unrelated note, I like how jQuery boasts about being "Only 32kB minified and gzipped.". That's quite a bit more than all of VNDB's Javascript combined. For a damn library.
2015-08-17js: Add L10N strings to all relevant varsYorhel1-8/+8
This simplifies the JS code in some places and removes a whole number of L10N strings from the "l10n_str" var, thus shrinking the JS size a bit (uncompressed about 1500 bytes, in fact. 500 bytes after gzip).
2015-08-15js: Let preprocess L10N strings + add L10N strings to some varsYorhel1-18/+48
This simplifies the JS version of mt() a bit and makes the whole internationalization framework a bit more robust. I also changed the VARS.{rlist_status,age_ratings,languages,platforms,char_roles} arrays to include the L10N string. This simplifies the JS code and reduces the JS size. There's a few more of such lists that can be transformed in the same way, I'll get to that later.
2015-08-15js: Wrap included files in anonymous functionYorhel1-1/+1
This removes the need to indent all files and add the anonymous function manually, and it also provides clean and consistent semantics. I already rewrote the library-like files earlier on to add their public interfaces to the window object, so everything should keep working after this change. It's still possible that some files use a function from another non-library file. Those will break, but I'm sure such cases will be found soon enough, if they exist.
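The wrapping step can be sketched like this. The helper names wrap_js() and bundle() are hypothetical and the real generator is a Perl script; this Python version only shows the idea of giving every included file its own function scope.

```python
def wrap_js(source: str) -> str:
    """Wrap one JS file in an immediately-invoked anonymous function."""
    return "(function(){\n" + source.rstrip("\n") + "\n})();\n"


def bundle(sources) -> str:
    """Concatenate several JS sources, each wrapped in its own scope."""
    return "".join(wrap_js(src) for src in sources)
```

Variables declared with var inside one file then stay local to that file, so anything meant to be shared has to be attached to the window object explicitly, as the message notes for the library-like files.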