Sal

joined 2 years ago
[–] Sal@mander.xyz 13 points 2 months ago (4 children)

Check in your settings whether you have disabled the visibility of bot responses. This can happen if bots replied to you and your settings are set to not see them.

[–] Sal@mander.xyz 5 points 3 months ago

I did not know of the term "open washing" before reading this article. Unfortunately it does seem like the pending EU legislation on AI has created a strong incentive for companies to do their best to dilute the term and benefit from the regulations.

There are some paragraphs in the article that illustrate the point nicely:

In 2024, the AI landscape will be shaken up by the EU's AI Act, the world's first comprehensive AI law, with a projected impact on science and society comparable to GDPR. Fostering open source driven innovation is one of the aims of this legislation. This means it will be putting legal weight on the term “open source”, creating only stronger incentives for lobbying operations driven by corporate interests to water down its definition.

[.....] Under the latest version of the Act, providers of AI models “under a free and open licence” are exempted from the requirement to “draw up and keep up-to-date the technical documentation of the model, including its training and testing process and the results of its evaluation, which shall contain, at a minimum, the elements set out in Annex IXa” (Article 52c:1a). Instead, they would face a much vaguer requirement to “draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model according to a template provided by the AI Office” (Article 52c:1d).

If this exemption or one like it stays in place, it will have two important effects: (i) attaining open source status becomes highly attractive to any generative AI provider, as it provides a way to escape some of the most onerous requirements of technical documentation and the attendant scientific and legal scrutiny; (ii) an as-yet unspecified template (and the AI Office managing it) will become the focus of intense lobbying efforts from multiple stakeholders (e.g., [12]). Figuring out what constitutes a “sufficiently detailed summary” will literally become a million dollar question.

Thank you for pointing out Grayjay, I had not heard of it. I will look into it.

 

Cross-posting to the OpenSource community as I think this topic will also be of interest here.

This is an analysis of how "open" different open source AI systems are. I am also posting the two figures from the paper that summarize this information below.

ABSTRACT

The past year has seen a steep rise in generative AI systems that claim to be open. But how open are they really? The question of what counts as open source in generative AI is poised to take on particular importance in light of the upcoming EU AI Act that regulates open source systems differently, creating an urgent need for practical openness assessment. Here we use an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods. Surveying over 45 generative AI systems (both text and text-to-image), we find that while the term open source is widely used, many models are ‘open weight’ at best and many providers seek to evade scientific, legal and regulatory scrutiny by withholding information on training and fine-tuning data. We argue that openness in generative AI is necessarily composite (consisting of multiple elements) and gradient (coming in degrees), and point out the risk of relying on single features like access or licensing to declare models open or not. Evidence-based openness assessment can help foster a generative AI landscape in which models can be effectively regulated, model providers can be held accountable, scientists can scrutinise generative AI, and end users can make informed decisions.

Figure 2: Openness of 40 text generators described as open, with OpenAI’s ChatGPT (bottom) as closed reference point. Every cell records a three-level openness judgement (✓ open, ∼ partial or ✗ closed). The table is sorted by cumulative openness, where ✓ is 1, ∼ is 0.5 and ✗ is 0 points. RL may refer to RLHF or other forms of fine-tuning aimed at fostering instruction-following behaviour. For the latest updates see: https://opening-up-chatgpt.github.io

Figure 3: Overview of 6 text-to-image systems described as open, with OpenAI's DALL-E as a reference point. Every cell records a three-level openness judgement (✓ open, ∼ partial or ✗ closed). The table is sorted by cumulative openness, where ✓ is 1, ∼ is 0.5 and ✗ is 0 points.
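
For anyone who wants to reproduce the sorting used in both figures, here is a minimal Python sketch of the scoring rule from the captions (✓ = 1, ∼ = 0.5, ✗ = 0, summed per system). The function names, system names and judgements are made up for illustration and are not taken from the paper.

```python
# Symbols and point values as given in the figure captions.
OPEN, PARTIAL, CLOSED = "✓", "∼", "✗"
POINTS = {OPEN: 1.0, PARTIAL: 0.5, CLOSED: 0.0}


def cumulative_openness(judgements):
    """Sum the points for one system's per-dimension judgements."""
    return sum(POINTS[j] for j in judgements)


def rank_by_openness(systems):
    """Return (name, score) pairs sorted from most to least open."""
    scored = [(name, cumulative_openness(cells)) for name, cells in systems.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical systems with invented judgements (illustration only).
example = {
    "system-a": [OPEN, OPEN, PARTIAL, CLOSED],
    "system-b": [OPEN, OPEN, OPEN, PARTIAL],
    "system-c": [CLOSED, CLOSED, PARTIAL, CLOSED],
}
ranking = rank_by_openness(example)
```

A single cumulative number is only a sorting aid; the paper's point is that openness is composite, so the per-dimension cells matter more than the total.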

There is also a related Nature news article: Not all ‘open source’ AI models are actually open: here’s a ranking

PDF Link: https://dl.acm.org/doi/pdf/10.1145/3630106.3659005

[–] Sal@mander.xyz 8 points 3 months ago

Thank you for being around, bringing this nice community here, and helping with the federation!! 😁

[–] Sal@mander.xyz 2 points 3 months ago

Hahaha, yes, I am Mexican 😁

[–] Sal@mander.xyz 7 points 3 months ago (2 children)

🥳 Thank you very much!

[–] Sal@mander.xyz 3 points 3 months ago (2 children)

I find it satisfying to see the graph come down :)

[–] Sal@mander.xyz 5 points 3 months ago* (last edited 3 months ago) (4 children)

Yes, sorry, there was some serious lag in fetching posts from Lemmy World that persisted for several days and accumulated a 1-week delay.

But after upgrading Mander it is now fetching data from LW quite rapidly and it should be back in sync in about a day and a half from now.

If you are curious about the ranking algorithm, there is some info here: https://join-lemmy.org/docs/contributors/07-ranking-algo.html

[–] Sal@mander.xyz 2 points 3 months ago

Amazing work! Thanks a lot!! Took me a few days to get to it but I have upgraded now and it looks great 😄

[–] Sal@mander.xyz 34 points 6 months ago (1 children)

If the timing is right, I would bring a mushroom grow bag with mushrooms sprouting.

If not... probably my Radiacode gamma spectrometer and some of my radioactive items. Maybe a clock with radium-painted dials and a piece of trinitite. I think that there are many different points of discussion that can be of interest to a broad audience (radioactivity, spectroscopy, electronics, the US labor law story of the Radium Girls, nuclear explosions, background radiation.... etc). As a bonus I can bring a UV flashlight and show the radium fluorescence. Adults love UV flashlights.

[–] Sal@mander.xyz 28 points 6 months ago (1 children)

First of all, congratulations on bringing a baby girl into this world!! You must be really excited! I am very happy for you!

This looks very cool. I set up a wiki (https://ibis.mander.xyz/) and I will make an effort to populate it with some Lemmy lore and interesting science/tech 😄 Hopefully I can set some time aside and help with a tiny bit of code too.

[–] Sal@mander.xyz 8 points 6 months ago

Thank you for the positivity 💚 I wholeheartedly agree!

Drama and negativity drive engagement, and this form of engagement can easily trigger a feedback loop in which negativity keeps piling on and voices of support are practically muted.

We are participating in an open source project that has some very ambitious goals. Things can be messy, mistakes happen, there are risks, and people have many different opinions and moods. Heated discussions can be a healthy part of the process. But, once the dust is allowed to settle for a bit, it is good to remember that we are humans and that we are here because we have some shared goals.

I think the majority of people around here are kind and have a positive outlook, but perhaps it is more motivating to speak out when we have negative comments than positive ones. So, thank you for taking the time to write this positive message!

[–] Sal@mander.xyz 10 points 7 months ago (1 children)

I am also quite interested in this. It is not something that keeps me awake at night, and I am not particularly paranoid about it. But I find that working towards answering this question is a fun frame from which to learn about electronics, radio communications, and networking.

Since this appears to be something that is causing you some anxiety, I think it is better if I start by reassuring you that I have not yet managed to prove that any electronic device is spying on me via a hidden chip. I don't think it is worth being paranoid about this.

I can explain some things that could be done to test whether a Linux computer is spying. I am not suggesting that you try any of this. I am explaining this to you so that you can get some reassurance from the fact that, if devices were spying on us in this manner, it is likely that someone would have noticed by now.

The "spy" chip needs some way to communicate. One way a chip might communicate is via radio waves. So, the first step would be to remove the WiFi and Bluetooth dongles and any other pieces of hardware that may emit radio waves during normal operation. A tool called a spectrum analyzer can then be used to detect the presence of specific radio frequencies. These devices are now relatively affordable; the tinySA Ultra, for example, can measure radio frequencies of up to 6 GHz.

One can make a Faraday cage, for example, by wrapping the PC with a copper-nickel coated polyester fabric to isolate the PC from the radio waves that are coming from the environment. The spectrum analyzer antennas can be placed right next to the PC and the device is left to measure continuously over several days. A script can monitor the output and keep a record of any RF signals.
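
The monitoring script could look roughly like the Python sketch below. It assumes each sweep has already been read from the analyzer as a list of (frequency in Hz, power in dBm) pairs; how you get those readings depends on the device and is left out here. The function names, the -90 dBm noise floor and the 10 dB margin are illustrative assumptions, not measured values.

```python
import csv
import io
from datetime import datetime, timezone


def find_signals(sweep, noise_floor_dbm=-90.0, margin_db=10.0):
    """Return the (freq_hz, power_dbm) points that rise above floor + margin."""
    threshold = noise_floor_dbm + margin_db
    return [(freq, power) for freq, power in sweep if power > threshold]


def record_signals(sweep, out, now=None, **kwargs):
    """Write one timestamped CSV row per detected signal; return the row count."""
    now = now or datetime.now(timezone.utc)
    writer = csv.writer(out)
    hits = find_signals(sweep, **kwargs)
    for freq_hz, power_dbm in hits:
        writer.writerow([now.isoformat(), freq_hz, power_dbm])
    return len(hits)


# Invented sweep: mostly noise, with one strong point at 2.4 GHz.
sweep = [(433.0e6, -95.2), (868.0e6, -93.8), (2.4e9, -52.1), (5.8e9, -91.0)]
log = io.StringIO()  # in a real run this would be a file opened in append mode
count = record_signals(sweep, log)
```

In a long-running measurement this would sit inside a loop that takes a new sweep every few seconds, so that after several days the CSV file holds every moment at which something rose above the noise floor inside the cage.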

Since phones are small, it is even easier to wrap them in the copper-nickel polyester fabric along with the spectrum analyzer antenna to check whether they emit any RF when they are off or in airplane mode with the WiFi and Bluetooth turned off.

What this experiment may allow you to conclude is that the spy chip is not communicating frequently with the external world via radio frequencies, at least not with frequencies <= 6 GHz.

Using frequencies higher than 6 GHz for a low-power chip is not going to be an effective method of transmitting a signal very far away. The chip could remain hidden and only emit the signal under certain rare conditions, or in response to a trigger. We can't rule that out with this experiment, but it is unlikely.

A next step would be to test a wired connection. It could be that the spy chip can transmit the data over the internet. One can place a VPN Gateway in between their PC and the router, and use that gateway to route all the traffic to their own server using WireGuard. All network packets that leave through the PC's ethernet connection can be captured and examined this way using Wireshark or tcpdump.

If one can show that the device is not secretly communicating via RF or via the internet, I think it is unlikely that the device is spying on them.

 

I was looking at /var/log/auth.log on my personal computer and my VPS, and I can see thousands of failed SSH attempts over the past few days. Looking at the attempted logins, I suppose that someone is using a database and trying out common default username/password combinations to attack random IP addresses. I also see that they try this for many different ports.
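
To put numbers on this, a small Python sketch like the one below can tally the failed attempts per source IP and per username. It assumes the usual OpenSSH "Failed password for ... from ... port ..." line format; the sample lines are invented and use documentation-reserved IP addresses. One thing to keep in mind when reading these lines: the port shown is the client's source port, not the port on the server that was targeted.

```python
import re
from collections import Counter

# Matches OpenSSH "Failed password" lines as they usually appear in auth.log.
FAILED = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port (?P<port>\d+)"
)


def tally_failures(lines):
    """Count failed password attempts per source IP and per username."""
    by_ip, by_user = Counter(), Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            by_ip[match.group("ip")] += 1
            by_user[match.group("user")] += 1
    return by_ip, by_user


# Invented sample lines (not from a real log).
sample = [
    "Jan 10 03:14:07 host sshd[1201]: Failed password for invalid user admin from 203.0.113.5 port 41822 ssh2",
    "Jan 10 03:14:09 host sshd[1201]: Failed password for root from 203.0.113.5 port 41830 ssh2",
    "Jan 10 03:15:41 host sshd[1207]: Failed password for invalid user pi from 198.51.100.23 port 50112 ssh2",
    "Jan 10 03:16:02 host sshd[1210]: Accepted publickey for sal from 192.0.2.10 port 51000 ssh2",
]
by_ip, by_user = tally_failures(sample)
```

On a real machine the same function can be fed an open file, for example `tally_failures(open("/var/log/auth.log"))`, and `by_user.most_common(10)` then shows which usernames are guessed most often.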

This method of attack appears to me to be very, very, very unlikely to return anything of value. They may as well just try generating bitcoin private keys randomly until they find a wallet with something in it.

Are these 'hackers' just playing the lottery and wasting their resources? Or is this a strategy that somehow works reasonably often?
