# Book Scanner Setup Guide

The book scanning feature lets users photograph a book cover with their device camera and automatically extracts the book's information using Google Gemini AI.

## Prerequisites

1. **Google Gemini API Key**: Get a free API key from [Google AI Studio](https://makersuite.google.com/app/apikey)
2. **Device with camera**: The feature requires a camera (front or back)
3. **Camera permissions**: Users must grant camera access when prompted

## Setup Instructions

### 1. Add Your Gemini API Key

Edit the `lib/config/api_config.dart` file and replace the placeholder:

```dart
class ApiConfig {
  // TODO: Replace with your actual Gemini API key
  static const String geminiApiKey = 'YOUR_GEMINI_API_KEY_HERE';
}
```

Replace `YOUR_GEMINI_API_KEY_HERE` with your actual Google Gemini API key.

### 2. Permissions

The app requests camera permissions automatically. The platform-specific settings it depends on are:

#### Android

- Camera permissions are already configured in `android/app/src/main/AndroidManifest.xml`
- No additional setup required

#### iOS

- Camera usage description is configured in `ios/Runner/Info.plist`
- The app will request camera permission when first launched
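
The runtime side of the permission flow can be sketched with the `permission_handler` package listed in this guide's dependencies. This is a minimal illustration, not the app's actual `CameraService` code; the helper name `ensureCameraPermission` is hypothetical.

```dart
import 'package:permission_handler/permission_handler.dart';

/// Hypothetical helper: asks for camera access and reports whether the
/// scanner may proceed. Not taken from the app's CameraService.
Future<bool> ensureCameraPermission() async {
  // request() shows the system dialog if the user has not decided yet;
  // otherwise it returns the current status.
  final status = await Permission.camera.request();
  if (status.isGranted) return true;

  // After a permanent denial the dialog no longer appears, so the only
  // way forward is the system settings page.
  if (status.isPermanentlyDenied) {
    await openAppSettings();
  }
  return false;
}
```

A scanner screen could await this helper before initializing the camera and show an explanatory message when it returns `false`.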

## How It Works

1. **Camera Preview**: The scanner screen shows a live camera preview with a scanning frame
2. **Capture**: Users tap the capture button to take a photo of the book cover
3. **AI Analysis**: The image is sent to Google Gemini AI for analysis
4. **Book Extraction**: Gemini extracts:
   - Book title
   - Author name
   - Genre (categorized into: fiction, fantasy, science, detective, biography, other)
   - Annotation/description
5. **Auto-fill**: The extracted information automatically fills the book form
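
The extraction and auto-fill steps above can be sketched as a small parser. It assumes the model is asked to reply with a single JSON object using the keys `title`, `author`, `genre` and `annotation`; that reply format, the `ScannedBook` class and the `parseScanResult` function are illustrative assumptions rather than the app's actual code.

```dart
import 'dart:convert';

/// The six genre categories named in this guide.
const allowedGenres = {
  'fiction', 'fantasy', 'science', 'detective', 'biography', 'other',
};

/// Hypothetical container for the extracted fields.
class ScannedBook {
  final String title;
  final String author;
  final String genre;
  final String annotation;

  const ScannedBook({
    required this.title,
    required this.author,
    required this.genre,
    required this.annotation,
  });
}

/// Parses a model reply that is assumed to contain one JSON object.
ScannedBook parseScanResult(String reply) {
  // Replies may wrap the JSON in a Markdown fence, so keep only the
  // part between the first '{' and the last '}'.
  final start = reply.indexOf('{');
  final end = reply.lastIndexOf('}');
  if (start < 0 || end <= start) {
    throw const FormatException('No JSON object found in reply');
  }
  final decoded = jsonDecode(reply.substring(start, end + 1));
  if (decoded is! Map<String, dynamic>) {
    throw const FormatException('Reply is not a JSON object');
  }
  final Map<String, dynamic> json = decoded;

  // Missing or non-string values become empty strings so the form can
  // still be filled in by hand.
  String field(String key) {
    final value = json[key];
    return value is String ? value.trim() : '';
  }

  // Anything outside the known categories falls back to 'other'.
  final genre = field('genre').toLowerCase();
  return ScannedBook(
    title: field('title'),
    author: field('author'),
    genre: allowedGenres.contains(genre) ? genre : 'other',
    annotation: field('annotation'),
  );
}
```

Falling back to `other` and to empty strings keeps a partially recognized cover usable, since the user reviews the form before saving.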

## Usage

1. Open the "Add Book" screen
2. Tap the camera/scanner area
3. Grant camera permissions if prompted
4. Position the book cover within the scanning frame
5. Ensure the text is clearly visible and readable
6. Tap the capture button (large white circle)
7. Wait for the AI analysis (typically 2-5 seconds)
8. Review the auto-filled information and edit it if needed
9. Save the book

## Tips for Better Scanning

- Ensure good lighting
- Hold the device steady
- Position the book cover within the green scanning frame
- Make sure the text is not blurred or obscured
- Prefer high-contrast covers and avoid glare or reflections
- Try a different angle if the first scan doesn't work

## Troubleshooting

### Camera not working

- Check that camera permissions are granted
- Close other apps that might be using the camera
- Restart the app

### Scanning fails or produces incorrect results

- Ensure the book cover text is clearly visible
- Try scanning in better lighting conditions
- Some covers with complex designs may be harder to recognize
- You can always edit the extracted information manually

### API errors

- Verify that your Gemini API key is correctly configured
- Check your internet connection
- Ensure you still have API quota available (the free tier is generous but rate-limited)

## Technical Details

### Services Created

1. **CameraService** (`lib/services/camera_service.dart`)
   - Manages camera initialization and lifecycle
   - Handles permissions
   - Provides image capture functionality

2. **GeminiService** (`lib/services/gemini_service.dart`)
   - Integrates with Google Gemini AI
   - Processes book cover images
   - Extracts structured book metadata
   - Handles error cases gracefully
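
The request that a service like `GeminiService` makes can be sketched with the `google_generative_ai` package. This is an illustration under stated assumptions, not the contents of `gemini_service.dart`: the function name `describeCover`, the model name `gemini-1.5-flash` and the prompt wording are all placeholders.

```dart
import 'dart:typed_data';

import 'package:google_generative_ai/google_generative_ai.dart';

/// Sketch of the request step. The model name and prompt wording are
/// assumptions, not the app's actual values.
Future<String?> describeCover(Uint8List jpegBytes, String apiKey) async {
  final model = GenerativeModel(model: 'gemini-1.5-flash', apiKey: apiKey);

  const prompt = 'Read this book cover and reply with a JSON object '
      'containing the keys title, author, genre and annotation. '
      'genre must be one of: fiction, fantasy, science, detective, '
      'biography, other.';

  // One multimodal message: the instruction text plus the photo bytes.
  final response = await model.generateContent([
    Content.multi([TextPart(prompt), DataPart('image/jpeg', jpegBytes)]),
  ]);

  // text is null when the reply contains no text part.
  return response.text;
}
```

Callers would typically wrap this in a `try`/`catch`, since network failures and rejected API keys surface as exceptions from `generateContent`.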

### Dependencies Added

- `camera: ^0.11.1` - Camera functionality
- `google_generative_ai: ^0.4.6` - Gemini AI integration
- `permission_handler: ^11.0.0` - Permission management

### Privacy & Security

- Images are sent to Google's servers for AI analysis
- Temporary images are deleted after processing
- API keys should be kept secure and not committed to version control
- Consider using environment variables for API keys in production
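
One way to keep the key out of version control is to supply it at build time instead of hardcoding it. The sketch below is a possible variant of `ApiConfig` using Dart's compile-time environment; the variable name `GEMINI_API_KEY` and the `isConfigured` getter are assumptions introduced for illustration.

```dart
class ApiConfig {
  // Compile-time value supplied with --dart-define; empty when the
  // define is not passed. The name GEMINI_API_KEY is an assumption.
  static const String geminiApiKey =
      String.fromEnvironment('GEMINI_API_KEY', defaultValue: '');

  // Lets the UI show a setup hint instead of failing on the first scan.
  static bool get isConfigured => geminiApiKey.isNotEmpty;
}
```

The value would then be passed on the command line, for example `flutter run --dart-define=GEMINI_API_KEY=your_key_here`. This keeps the key out of the repository, but it is still compiled into the app binary, so it does not replace key restrictions or a server-side proxy.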

## Cost Considerations

- Google Gemini API has a generous free tier
- A typical book scan uses few tokens
- Monitor your API usage in the Google Cloud Console if needed

## Future Enhancements

Potential improvements to consider:

- Barcode/ISBN scanning as an alternative
- Offline scanning capability
- Batch scanning for multiple books
- Image quality enhancement before sending to AI
- Support for multiple languages
- Custom AI prompts for better recognition