this post was submitted on 17 Jul 2023
37 points (100.0% liked)

Technology

37699 readers
271 users here now

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

Subcommunities on Beehaw:


This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 2 years ago
MODERATORS
 

EDIT: With thanks to u/code@lemmy.mayes.io, I have a solution for cleaning up the one table that is growing ridiculously in size and that is the activity table. A TRUNCATE activity; command cleaned that right up. A word of caution is in order because the Lemmy instance must be shutdown completely and only the postgres container running if you're using docker.

As a new Lemmy admin, I think I really need to learn PostgreSQL administration because I need a way to keep my database from ballooning in size and there are no management tools built yet for it. Would someone be so kind as to recommend a good website for learning PostgreSQL? This would also help me maintain my mastodon instance.

top 11 comments
sorted by: hot top controversial new old
[–] Penguincoder@beehaw.org 10 points 1 year ago* (last edited 1 year ago)

learn PostgreSQL administration

Definitely helpful, but administration only goes so far with the Lemmy database. Take a look at this post and let me know if it answers any of your questions; if you have more feel free to ask, or ping me on matrix @penguincoder:hive.beehaw.org

[–] admin@beehaw.org 10 points 1 year ago (1 children)

I'll ping @Penguincoder@beehaw.org since they would know a lot more about this.

[–] housepanther@lemmy.goblackcat.com 5 points 1 year ago (1 children)
[–] code@lemmy.mayes.io 5 points 1 year ago (4 children)

Look at vacumn command. Also dbeaver is a teally good gui tool. Right now there is one table thats the main cause (i think its called activity but dont quote me on that there was a post about it)

That table is mainly for debugging as it logs every action in activitypub your instance does. I truncate it once a week and vacumn it (shrinks the disk space used). Caution. You must shut down lemmy before doing so. I run containers so i stop all except postgres and clean it up.

[–] phoenix591@lemmy.phoenix591.com 3 points 1 year ago* (last edited 1 year ago)

looks like the devs are on it.

After that commit my database dump went from around 1.1G to 260M

[–] Penguincoder@beehaw.org 1 points 1 year ago

You really don't want to do a vacuum full on the live DB, it will lock everything. While useful and necessary I think, you have got to plan down time for it. Or your site is going to be inaccessible anyway.

[–] housepanther@lemmy.goblackcat.com 1 points 1 year ago (1 children)

You are exactly correct about that one table. It is like 90% of the space. When I ran the vacuum I did not shutdown Lemmy. Now I know.

[–] Penguincoder@beehaw.org 1 points 1 year ago

Really ugly, but really works. Connect to psql and run:

WITH RECURSIVE pg_inherit(inhrelid, inhparent) AS
    (select inhrelid, inhparent
    FROM pg_inherits
    UNION
    SELECT child.inhrelid, parent.inhparent
    FROM pg_inherit child, pg_inherits parent
    WHERE child.inhparent = parent.inhrelid),
pg_inherit_short AS (SELECT * FROM pg_inherit WHERE inhparent NOT IN (SELECT inhrelid FROM pg_inherit))
SELECT table_schema
    , TABLE_NAME
    , row_estimate
    , pg_size_pretty(total_bytes) AS total
    , pg_size_pretty(index_bytes) AS INDEX
    , pg_size_pretty(toast_bytes) AS toast
    , pg_size_pretty(table_bytes) AS TABLE
    , total_bytes::float8 / sum(total_bytes) OVER () AS total_size_share
  FROM (
    SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes
    FROM (
         SELECT c.oid
              , nspname AS table_schema
              , relname AS TABLE_NAME
              , SUM(c.reltuples) OVER (partition BY parent) AS row_estimate
              , SUM(pg_total_relation_size(c.oid)) OVER (partition BY parent) AS total_bytes
              , SUM(pg_indexes_size(c.oid)) OVER (partition BY parent) AS index_bytes
              , SUM(pg_total_relation_size(reltoastrelid)) OVER (partition BY parent) AS toast_bytes
              , parent
          FROM (
                SELECT pg_class.oid
                    , reltuples
                    , relname
                    , relnamespace
                    , pg_class.reltoastrelid
                    , COALESCE(inhparent, pg_class.oid) parent
                FROM pg_class
                    LEFT JOIN pg_inherit_short ON inhrelid = oid
                WHERE relkind IN ('r', 'p')
             ) c
             LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
  ) a
  WHERE oid = parent
) a
ORDER BY total_bytes DESC LIMIT 3;

That will show the top 3 database tables sizes. I bet you number one is activity.

Yeah, that worked magically!

[–] linuxguy@lemmy.gregw.us 4 points 1 year ago

Why are you concerned with your database ballooning?

What do you think you'll be able to do to the database to prevent this?

What do you want to do?

load more comments
view more: next ›