acoustsee

USAGE

Web Application Interface

The web application features a user-friendly interface divided into five interactive regions, designed to facilitate seamless control and customization of the synesthetic audio experience. All camera processing is performed locally on your device, ensuring privacy. No frames are transmitted externally, though the browser will request camera access permission to enable this local processing for audio generation.

Interface Regions

Center Rectangle: Audio Enabler
A touch-sensitive area that activates the webpage’s audio output, allowing sound generation to begin.
Top Border Rectangle: Settings SHIFTer Button
Toggles the settings mode to reveal advanced configuration options.
Bottom Rectangle: Start/Stop Button
Initiates or pauses the audio generation and camera processing.
Left Rectangle: Day/Night Switch
Inverts light logic to optimize visibility and processing for different lighting conditions.
Right Rectangle: Language Switcher
Changes the interface language for improved accessibility.
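
As a rough illustration of what inverting the light logic for the Day/Night Switch could mean, the sketch below maps pixel brightness to loudness and flips that mapping in night mode. The helper name `brightness_to_gain`, the 0-255 range, and the choice that bright means loud in day mode are assumptions made for this example, not taken from the AcoustSee source.

```python
def brightness_to_gain(brightness: int, night_mode: bool = False) -> float:
    """Map an 8-bit brightness value (0-255) to a gain between 0.0 and 1.0.

    Hypothetical helper: in day mode brighter pixels are louder;
    in night mode the mapping is inverted so darker pixels are louder.
    """
    if not 0 <= brightness <= 255:
        raise ValueError("brightness must be in the range 0-255")
    level = brightness / 255.0
    return 1.0 - level if night_mode else level


if __name__ == "__main__":
    # Print the day-mode and night-mode gain for a few sample values.
    for value in (0, 128, 255):
        day = brightness_to_gain(value)
        night = brightness_to_gain(value, night_mode=True)
        print(value, round(day, 3), round(night, 3))
```

For any input, the day and night gains in this sketch add up to 1.0, which is one simple way to read "inverted".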

Settings Mode (SHIFTed Interface)

When settings are enabled via the SHIFTer button:

SHIFTed Left Rectangle: Grid Selector
Adjusts the camera’s framing or “gridding” of the environment, allowing users to customize how the visual input is segmented for audio mapping.
SHIFTed Right Rectangle: Audio Engine Selector
Modifies the sound synthesizer’s response to the selected grid, enabling users to tailor the audio output to their preferences.
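
To make the idea of "gridding" more concrete, here is a minimal sketch that splits a grayscale frame into a grid of cells and computes the mean brightness of each cell, which an audio engine could then map to sound. The function name `grid_means` and the list-of-lists frame format are assumptions for illustration; the web application may segment frames differently.

```python
from typing import List

Frame = List[List[int]]  # rows of 8-bit grayscale pixel values


def grid_means(frame: Frame, rows: int, cols: int) -> List[List[float]]:
    """Split a frame into rows x cols cells and return each cell's mean brightness."""
    height, width = len(frame), len(frame[0])
    if not (1 <= rows <= height and 1 <= cols <= width):
        raise ValueError("grid must have between 1 cell and one cell per pixel")
    result = []
    for r in range(rows):
        # Integer cell boundaries that cover the frame without gaps or overlap.
        top, bottom = r * height // rows, (r + 1) * height // rows
        row_means = []
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cell = [frame[y][x] for y in range(top, bottom) for x in range(left, right)]
            row_means.append(sum(cell) / len(cell))
        result.append(row_means)
    return result


if __name__ == "__main__":
    # Toy 4x4 frame (hypothetical data): dark left half, bright right half.
    frame = [[0, 0, 255, 255] for _ in range(4)]
    print(grid_means(frame, 2, 2))
```

A coarser grid gives fewer, broader audio cues; a finer grid gives more detail at the cost of a busier soundscape.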

The latest stable version is hosted at:

https://mamware.github.io/acoustsee/present

Browser compatibility list:

Browser	Minimum Version for Full Support	Notes
Chrome for Android	Chrome 47 (December 2015)	Full support for getUserMedia, AudioContext, and createStereoPanner.
Safari on iOS	iOS 14.5 (Safari 14.1, April 2021)	Supports unprefixed AudioContext and createStereoPanner. No vibration support.
Firefox for Android	Firefox 50 (November 2016)	Full support for all APIs, though SpeechSynthesis may be inconsistent.
Samsung Internet	Samsung Internet 5.0 (2017)	Based on Chromium, full support for all APIs.
Opera Mobile	Opera 36 (2016)	Based on Chromium, full support for all APIs.
Edge for Android	Edge 79 (January 2020)	Based on Chromium, full support for all APIs.

To test our first commit, which is a Python script, either out of curiosity or for educational purposes, follow the instructions below.

How to run the first iteration: a simple proof of concept that processes a static image file and outputs a basic left/right panned audio file.

Setup

Clone the Repo:

   git clone https://github.com/MAMware/acoustsee.git
   cd acoustsee

Set Up Virtual Environment:

  python3 -m venv acoustsee_env
  source acoustsee_env/bin/activate

Install Dependencies:

  pip install opencv-python-headless numpy scipy pyo

Run the MVP:

For local machines:

  python src/main.py

For headless environments (e.g., Codespaces):

  python src/main_codespaces.py

Try it with examples/wall_left.jpg to hear a basic left/right audio split!
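
The left/right split can be sketched with the Python standard library alone. The example below is not the code in src/main.py: it assumes a toy grayscale "image" given as nested lists, assumes that a brighter half should sound louder on that side, and uses hypothetical names (`left_right_gains`, `write_panned_tone`, `panned_demo.wav`).

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second


def left_right_gains(frame):
    """Return (left, right) gains in 0.0-1.0 from the mean brightness of each image half."""
    half = len(frame[0]) // 2
    left = [px for row in frame for px in row[:half]]
    right = [px for row in frame for px in row[half:]]
    return sum(left) / len(left) / 255.0, sum(right) / len(right) / 255.0


def write_panned_tone(path, left_gain, right_gain, freq=440.0, seconds=1.0):
    """Write a 16-bit stereo WAV containing a sine tone scaled per channel."""
    frames = bytearray()
    for i in range(int(SAMPLE_RATE * seconds)):
        sample = math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE)
        # Interleave left and right samples as little-endian signed 16-bit values.
        frames += struct.pack(
            "<hh",
            int(32767 * left_gain * sample),
            int(32767 * right_gain * sample),
        )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))


if __name__ == "__main__":
    # Toy 2x4 frame (hypothetical data): bright on the left, dark on the right.
    frame = [[255, 255, 0, 0], [255, 255, 0, 0]]
    left, right = left_right_gains(frame)
    write_panned_tone("panned_demo.wav", left, right)
```

Played on headphones, the resulting file should sound only in the left ear, since the right half of the toy frame is fully dark.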

Troubleshooting the Python version installation

Windows pyo Installation:
- Use Python 3.11 or 3.12 for best compatibility.
- Install Microsoft Visual C++ Build Tools.
- Ensure PortAudio is installed and in your PATH.
- Example:
```
python3.11 -m venv acoustsee_env
.\acoustsee_env\Scripts\activate
pip install opencv-python numpy scipy pyo
```

Linux pyo Installation (e.g., GitHub Codespaces):

Use a virtual environment:

python3 -m venv acoustsee_env
source acoustsee_env/bin/activate

Install development libraries:

sudo apt update
sudo apt install -y libportaudio2 portaudio19-dev libportmidi-dev liblo-dev libsndfile1-dev libasound-dev libjack-dev build-essential libgl1-mesa-glx

Install Python dependencies:

pip install opencv-python-headless numpy scipy pyo

If opencv-python fails with libGL.so.1 errors, use opencv-python-headless:

pip uninstall -y opencv-python
pip install opencv-python-headless

If Python 3.12 fails, try Python 3.11:

sudo apt install -y python3.11 python3.11-venv
python3.11 -m venv acoustsee_env
source acoustsee_env/bin/activate
pip install opencv-python-headless numpy scipy pyo
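
After installing, a quick check from inside the virtual environment can confirm that the interpreter sees all four packages. This is a small helper sketch, not part of the repository; the function name `missing_modules` is made up for this example. Note that both opencv-python and opencv-python-headless are imported as `cv2`.

```python
import importlib.util
import sys

# Import names of the packages installed above (OpenCV is imported as cv2).
REQUIRED_MODULES = ("cv2", "numpy", "scipy", "pyo")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks the modules up without importing them, so it will not trigger pyo's audio or GUI warnings.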

Headless Environments (e.g., Codespaces):
- Codespaces lacks audio output. Use main_codespaces.py to generate WAV files:
```
python src/main_codespaces.py
```
- Download examples/output.wav via the Codespaces file explorer and play locally.
- Example WAV test:
```
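from pyo import *
# Boot an offline server: audio is rendered to a file instead of a sound card.
s = Server(audio="offline").boot()
# Render 2 seconds of audio into test.wav.
s.recordOptions(dur=2, filename="test.wav")
# A 440 Hz sine at half amplitude, routed to the output.
sine = Sine(freq=440, mul=0.5).out()
# In offline mode, start() performs the rendering and writes the file.
s.start()
s.stop()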
```
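
Since a headless environment cannot play the file, one way to sanity-check a generated WAV before downloading it is to read its header with the standard-library `wave` module. The sketch below assumes the file is uncompressed PCM, which is the only format `wave` reads; `describe_wav` is a hypothetical helper name.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return channels, rate, duration


if __name__ == "__main__":
    # "test.wav" is the file name used in the pyo example above.
    channels, rate, duration = describe_wav("test.wav")
    print(f"{channels} channel(s), {rate} Hz, {duration:.2f} s")
```

A duration of zero, or an error when opening the file, suggests the rendering step did not complete.
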
WxPython/Tkinter Warning:
- pyo may warn about missing WxPython, falling back to Tkinter. This is harmless for WAV generation.
SetuptoolsDeprecationWarning:
- A warning about License :: OSI Approved :: GNU General Public License is harmless (it’s a pyo packaging issue).

Privacy and Processing

The application processes all camera data locally on your device, so no visual information leaves it. On launch, the browser requests camera access to perform this local processing, which is needed to generate the real-time audio cues used for navigation.

Still stuck? Open an issue on GitHub or ping us on X.
